DNA bending on length scales shorter than a persistence length plays an integral role in the
translation of genetic information from DNA to cellular function. Quantitative experimental studies
of these biological systems have led to a renewed interest in the polymer mechanics relevant for
describing the conformational free energy of DNA bending induced by protein-DNA complexes.
Recent experimental results from DNA cyclization studies have cast doubt on the applicability of
the canonical semiflexible polymer theory, the wormlike chain (WLC) model, to DNA bending on
biological length scales.

This paper develops a theory of the chain statistics of a class of generalized semiflexible polymer
models. Our focus is on the theoretical development of these models and the calculation of
experimental observables. To illustrate our methods, we focus on a specific toy model of DNA bending. We
show that the WLC model generically describes the long-length-scale chain statistics of semiflexible
polymers, as predicted by the Renormalization Group. In particular, we show that either the WLC
or our new model adequately describes force-extension, solution scattering, and long-contour-length
cyclization experiments, regardless of the details of DNA bend elasticity. In contrast, experiments
sensitive to short-length-scale chain behavior can in principle reveal dramatic departures from the 
linear elastic behavior assumed in the WLC model. We demonstrate this explicitly by showing 
that our toy model can reproduce the anomalously large short-contour-length cyclization J factors 
observed by Cloutier and Widom. Finally, we discuss the applicability of these models to DNA 
chain statistics in the context of future experiments. 

PACS numbers: 87.14.Gg, 87.15.La, 82.35.Pq, 36.20.Hb



I. INTRODUCTION 



The statistical mechanics of linear polymers has long attracted the attention of 
physicists and chemists alike. The mechanics of DNA is of considerable biological 
relevance to describing the free energy landscape controlling protein-induced DNA 
bending. These protein-DNA interactions are of central importance to cellular
function on a microscopic scale, from chromosomal DNA packaging, to transcription
and gene regulation, to viral packaging [1]. Protein-DNA interactions typically
induce short-length-scale DNA bending, which couples the chemical and physical
properties of DNA.

A particularly important and successful application of polymer statistics has been
in the description of double-stranded DNA (dsDNA) by the wormlike chain (WLC)
model. In the WLC model, DNA is modeled as a fluctuating, linearly elastic rod.
This simple model has been remarkably successful in describing many aspects of
DNA mechanics and the statistics of semiflexible polymers generally. In particular,
the WLC model describes the extension of a single dsDNA molecule under an external
force with impressive precision.

Despite the notable theoretical and experimental success of the wormlike chain
model, recent DNA cyclization studies by Cloutier and Widom [6] have cast doubt
on the validity of the WLC model for describing the cyclization of
short-contour-length sequences of DNA. In still more recent cyclization studies, Vologodskii and
coworkers claim that the WLC model does accurately describe the cyclization of
short DNA sequences. Nevertheless, as we will explain later, a number of
experiments do seem to point to a role for elastic breakdown in DNA mechanics.

With the current experimental situation still in flux, it seems imperative to
reevaluate the WLC model theoretically. We wish to answer the questions: (i) How could
such a simple theory hope to describe a complex molecule like DNA? (ii) More
precisely, which classes of experiments would we expect to be successfully described
by the WLC model, and which might require a different theory? Do these experiments
correspond to the known successes or the recently reported failures of the theory?
In other words, we are asking how much room the classic tests of the WLC model
leave for generalization of this model, and how completely these experiments
test the WLC model. Finally, we must ask: (iii) Would a breakdown of the WLC
model have any biological significance?

The focus of this paper will be the theoretical analysis of these questions and the
development and discussion of more general semiflexible polymer models. Although
these ideas are widely applicable to polymer statistics in general, the focus of this
paper will be exclusively the mechanics of DNA. We shall attempt a synthesis of the
existing experimental knowledge to determine which aspects of DNA bending are
probed by existing experiments. In particular, we determine which experiments are
most sensitive to the DNA mechanics relevant for understanding biological systems.
In the remainder of this introduction, we shall quickly outline our answers to the
questions posed above.

A. Scale dependence in statistical physics

First, to put the possible breakdown of the WLC model into perspective, it is
helpful to consider the bending of macroscopic rods. To engineers in the mechanics
community, whose work has been the study of macroscopic bending, the failure
of a linear elastic model at high curvature is more pedestrian than remarkable.
The linear elastic theory is understood to apply only to the small deflection limit.
What is perhaps more remarkable to some is that a linear-elastic model describes
a macromolecular polymer at all, let alone to the accuracy illustrated by
force-extension measurements!

To put the success of the WLC model into perspective, it is helpful to consider
DNA mechanics from the viewpoint of the statistical mechanics of condensed matter
systems. Many physical properties of complicated condensed matter systems
have been captured by a small set of theories formulated in terms of renormalizable
operators. Regardless of the complicated structure of the theory at short length
scales, the Renormalization Group (RG) guarantees that the long-length-scale chain
statistics will be described by an effective energy functional containing only a few
terms. In fact, for semiflexible polymers, only one such "renormalizable" term
exists with the right symmetries. As a consequence, all semiflexible polymers share
generic long-length-scale behavior: that described by the WLC model. Physically,
this loss of information about the microscopic details of molecular mechanics arises
from the averaging effect of thermal fluctuations.

The RG world-view leads us to expect that experiments like measuring the
force-extension relation of long DNA would reveal only generic behavior, insensitive to
microscopic details of DNA elasticity. On short enough length scales, however, the
underlying structure of the theory becomes important. Violations of the linear elastic
theory, analogous to those observed in macroscopic bending, are therefore expected
in experiments that probe the short-length-scale bending of DNA. Indeed, early
AFM imaging experiments did see the onset of deviations from WLC expectations
on short scales. Cyclization experiments, like the ones in Ref. [6], hold the
prospect of greater sensitivity to the high-curvature regime.

B. Summary of this paper 

This paper develops the qualitative framework outlined above by introducing a
generalization of the wormlike chain model. This class of models, introduced in
Sect. II, generalizes the WLC by describing a semiflexible polymer by an arbitrary
local bending energy function. Sects. II C and II D introduce an explicit toy model
of DNA bending, the "Sub-Elastic Chain" (SEC) model, motivated by imaging
data on DNA adsorbed to mica [10] and by recent nanoscale force measurements.

Sect. II E illustrates a computational procedure for computing the tangent
distribution function for arbitrary contour length in generalized theories. Sect. II H
introduces the persistence length in generalized theories and shows that these
theories converge to the WLC model at long contour length.

The remainder of the paper focuses on the spatial distribution of the polymer.
The spatial distribution is of particular importance for biological applications where
the contribution of chain statistics to biological function can often be formulated in
terms of an effective end-concentration, the Jacobson-Stockmayer factor (J factor).
Physically, this effective concentration is the probability density of the polymer
having the correct configuration for binding to the binding site of a protein. Sect. III
introduces a method for computing the spatial and tangent-spatial distributions
of generalized semiflexible polymer models in terms of a framework developed by
Spakowitz and Wang and others. Sect. III A explicitly computes
the spatial distributions for both the SEC and WLC models for various contour
lengths. We discuss the Renormalization Group applied to spatial distributions
and show the predicted convergence of the SEC and WLC models at long contour
length. Sects. III B and III C show that the force extension and the structure
factor computed for general theories are nearly identical to the WLC model results,
implying that these experiments do not probe the high-curvature chain statistics
important for many biological processes. Sect. III D computes the cyclization J
factor for generic theories. We show that the SEC model gives rise to the enhanced
cyclization efficiency for short-contour-length sequences observed by Cloutier and
Widom [6] while leaving the long-contour-length J factor identical to that predicted
by the WLC model. Finally, we discuss the results of this paper in the context of
the recent cyclization measurements of Vologodskii and coworkers and recent
measurements of the deflection force for short sequences of DNA [11].



C. Relation to other recent work 

Following Crick and Klug's initial suggestion that DNA might kink, many
classical works investigated the structural implications of this conformational
change at the single-basepair level (reviewed in [17]). Indeed, many known
DNA-protein complexes do display kinks in the DNA backbone. In contrast, our focus
here is on physical measurements of DNA mechanics on a mesoscopic,
several-basepair scale relevant for biological processes like DNA looping in vivo. As
described in Sect. II C, both Yan and Marko and we previously formulated and solved
"kinking" models, in which DNA is assumed to undergo a sudden loss of bend
stiffness beyond a critical stress.

Other related models were also formulated and solved in Refs. [21] and [22].
Sucato et al. have also performed Monte
Carlo simulations of kinkable chains, to obtain information about their structural
and thermodynamic properties [23]. Unlike these prior articles, the present work
explores the proposal that the breakdown of linear elasticity, when coarse-grained
to the mesoscopic scale, is effectively less abrupt than in the kinking models. We
suggest that such a model can reconcile the growing evidence for elastic
breakdown with the generic absence of sharply kinked states when tightly bent DNA is
observed microscopically.

II. DEFINING DISCRETE LINK THEORIES 
A. Local energy functions 

In this paper, we discuss a class of generalized elastic models for the statistical
mechanics of semiflexible, inextensible polymers. The theories we discuss will be
applicable to the description of polymers on length scales longer than the scale of
the molecular structure. Accordingly, we idealize a semiflexible polymer as a chain
(or "rod") consisting of $N$ discrete segments ("links"), each of length $\ell$, joined by
semiflexible hinges ("vertices"; see Fig. 1). The link length $\ell$ should be taken to be
shorter than the scale of the experiment we wish to describe, for example, shorter
than the total length of the DNA in a cyclization experiment.

We then introduce a coarse-grained free energy cost for each chain configuration

$$E = \sum_{j=1}^{N} E_j, \qquad (1)$$

where $E_j$ is the energy associated with vertex $j$. To make a connection with the
continuum mechanics picture, it is convenient to write this vertex bending energy
in terms of an energy density $\epsilon$:

$$E_j = \ell\,\epsilon(\ldots, \vec t_{j-1}, \vec t_j, \ldots; j), \qquad (2)$$

that is a function of the $N$ tangent vectors $\{\vec t_1, \ldots, \vec t_N\}$ and the vertex number $j$.
The coarse-grained configurational free energy $E$ is a combination of entropic and
energetic parts, which depend on the underlying molecular structure of the polymer.
We will ignore the effects of excluded volume, since we will be principally interested
in bending on length scales where self-interaction effects play a negligible role in
describing the chain conformation. We will also not allow for long-range interactions
(that is, longer than $\ell$); for instance, we assume that the solution conditions fix an
electrostatic screening length smaller than $\ell$.

To focus our attention on the novel effects of the hypothesis of elastic breakdown,
we will restrict Eq. (2) to a subclass of models by making some assumptions about
the form of the free energy density $\epsilon$. Although this subclass is not a fully realistic
depiction of known properties of DNA, it does have the virtue of being analytically
tractable. Features of real DNA neglected in our models can be introduced in more
numerical approaches once the phenomena we study are appreciated.

FIG. 1: Link and vertex numbering. The energy is a function of the deflection angles.
The deflection angle between links $i$ and $i+1$ is $\theta_i$.

First we shall assume that the free energy density does not explicitly depend on
the position $j$ (the chain is homogeneous). Strictly, this is not the case for DNA,
since both the helical pitch and the sequence dependence spoil homogeneity.
We will study the mechanics of DNA on length scales longer than the helical repeat
(3.6 nm), where helical effects approximately average to zero. Sequence dependence
is a more serious omission, but we make this approximation in order to obtain
analytical formulas. Having agreed to neglect helical effects, it is reasonable to add
the assumption that the theory is rotationally invariant (the bending stiffness is
isotropic). Last, we assume that the energy density involves only the first derivative
of $\vec t$, the discretized version of curvature:

$$\vec\kappa(s) = \Delta\vec t/\Delta s, \quad \text{where } s_j = j\ell. \qquad (3)$$

By rotational invariance, the energy density is a function of the magnitude of the
curvature only:

$$\epsilon = \epsilon(\kappa), \quad \text{where } \kappa_j \equiv \theta_j/\ell \approx |\vec\kappa(s_j)|. \qquad (4)$$

The most questionable of the assumptions above is the dropping of higher
derivatives, which we will call "locality." For example, the role of nonlinear elastic-energy
terms such as $\kappa^4$ will be a central concern of this paper, but this term has the same
dimension, (length)$^{-4}$, as a term like $(\Delta\kappa/\Delta s)^2$, which we drop. Our justification
is that higher-derivative terms correspond physically to cooperative conformational
changes along the polymer, and although there are hints of long-range
cooperativity in DNA [24], the phenomena addressed in this paper do not seem to
require such behavior. Again, our restricted class of theories is an analytically
tractable starting point for the study of the effects of nonlinear-elastic behavior
in an entropy-dominated chain. We will return to this point briefly in the
Discussion (Sect. IV).

An important one-parameter family of polymer theories is described by an energy
density that is quadratic in the curvature:

$$\epsilon(\kappa) = \frac{\xi}{2}\,\kappa^2. \qquad (5)$$

The models described by Eq. (5) have a restoring torque $-d\epsilon/d\theta$ that is linear in
the link deflection $\theta$: they are linear-elasticity theories. Their continuum limits are
called wormlike chain models. A WLC model is completely characterized by one
number, the elastic bending modulus $\xi k_BT$.
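
As a concrete illustration of the definitions in this subsection, the following minimal Python sketch evaluates the coarse-grained energy of a discrete chain from its link tangents. It is only a sketch under stated assumptions: the quadratic energy density is taken to be $\epsilon = \xi\kappa^2/2$ with $\kappa_j = \theta_j/\ell$, the sum runs over the vertices between consecutive links, the parameter values ($\xi = 53$ nm, $\ell = 5$ nm) are illustrative, and all function names are hypothetical.

```python
import math

def deflection_angle(t1, t2):
    """Angle between two unit tangent vectors (any dimension)."""
    dot = sum(a * b for a, b in zip(t1, t2))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against round-off

def wlc_energy_density(kappa, xi=53.0):
    """Assumed quadratic energy density, in units of kBT per nm."""
    return 0.5 * xi * kappa ** 2

def chain_energy(tangents, ell=5.0, energy_density=wlc_energy_density):
    """Total energy in kBT: sum over vertices of ell * eps(theta_j / ell)."""
    total = 0.0
    for t_prev, t_next in zip(tangents, tangents[1:]):
        theta = deflection_angle(t_prev, t_next)
        total += ell * energy_density(theta / ell)
    return total

# Hypothetical test configurations: a straight chain and one right-angle vertex.
straight = [(1.0, 0.0, 0.0)] * 4
one_bend = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
print(chain_energy(straight))  # no bending, so the energy vanishes
print(chain_energy(one_bend))  # a single vertex with theta = pi/2
```

Passing a different `energy_density` callable gives the energy of any other local, homogeneous, isotropic model in the class considered here.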

B. Statistical mechanics 

We define the statistical mechanics theory associated with an energy function $\epsilon$
in the canonical way. The probability of a coarse-grained chain conformation is
given by the Boltzmann law:

$$\mathcal{P} = Z^{-1}\exp(-E), \qquad (6)$$

where $Z$ is the partition function, determined by normalization, and we have defined
the effective coarse-grained free energy $E$ in units of $k_BT$.

The link length $\ell$ plays two key roles in the theory. First, it defines the coarse-grained
configuration space: a chain of physical length $L_{\rm tot}$ consists of $N = L_{\rm tot}/\ell$ links,
each with its orientation variable $\vec t_j$. Second, $\ell$ enters the vertex energy, Eq. (2), explicitly. The physical
meaning of $\ell$ is not obvious, however: it does not correspond to any crystallographic
length in the DNA structure, for example the basepair rise of 0.34 nm. In fact,
strictly speaking $\ell$ is not a parameter of the theory at all, because two different
values of $\ell$ can give rise to theories with identical predictions, if the two theories'
energy functions $\epsilon$ are suitably adjusted. Instead, $\ell$ is needed to give meaning to
the energy function $\epsilon$. The adjustment needed to maintain a fixed theory as $\ell$ is
changed is called the "renormalization group flow" of $\epsilon$.

It may seem tempting to eliminate $\ell$ from the theory by attempting a continuum
limit, $\ell \to 0$, and indeed continuum mechanics does just this. When fluctuations
are important, however, the continuum limit can discard some legitimate physics,
and so must be treated with caution. In fact, we will argue that in the polymer
context, the continuum limit leads only to a subset of models corresponding to the
WLC (Eq. 5), because there is only one renormalizable term in the energy density
with the right symmetries to meet our assumptions.

The fact that more general energy densities always have continuum limits de- 
scribable by WLC models does not mean, however, that the WLC exhausts the 
physically legitimate and interesting models for stiff polymers. After all, we cannot 
expect continuum elasticity to remain valid on molecular length scales. Rather, 
this observation only implies that models more general than WLC must be defined 
with respect to some finite length scale £. 

The assumptions we made in Sect. II A imply that the partition function for
an unconstrained $N$-link chain decouples into $N-1$ factors of the single-vertex
partition function:

$$q = \int d\vec t_{i+1}\, e^{-E_i}, \qquad (7)$$

where $\vec t_{i+1}$ is the out-going tangent of vertex $i$ (see Fig. 1). In the expression
above, $d\vec t_{i+1}$ denotes an element of the $(d-1)$-dimensional sphere of unit vectors in
$d = 2$ or 3 dimensions; both two and three dimensions are of experimental interest.

We now introduce the fundamental tangent distribution function. The tangent
distribution function is the conditional probability density for the final tangent,
given an initial tangent. The fundamental tangent distribution function is the
distribution function over just one link, of length $\ell$, and is related to the vertex energy
by the Boltzmann distribution (Eq. 6):

$$g(\vec t_{i+1}; \vec t_i) = q^{-1} e^{-E_i}, \qquad (8)$$

where $\vec t_i$ and $\vec t_{i+1}$ are the initial and final tangents, respectively, and the deflection
angle at vertex $i$ is given by $\cos\theta_i = \vec t_i \cdot \vec t_{i+1}$. The chain statistics of the theory are
completely determined by the fundamental tangent distribution.
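
The single-vertex partition function and the fundamental tangent distribution can be evaluated numerically for any vertex energy. The sketch below does this in three dimensions with a simple midpoint rule. It assumes the discretized WLC vertex energy $E = \xi\theta^2/(2\ell)$ (from $E = \ell\epsilon$ with $\epsilon = \xi\kappa^2/2$ and $\kappa = \theta/\ell$), illustrative values $\xi = 53$ nm and $\ell = 5$ nm, and hypothetical function names.

```python
import math

def vertex_partition_3d(vertex_energy, n=20000):
    """q = integral over the unit sphere of exp(-E(theta)), midpoint rule in theta."""
    d = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * d
        total += 2.0 * math.pi * math.sin(theta) * math.exp(-vertex_energy(theta)) * d
    return total

def tangent_distribution_3d(vertex_energy):
    """Return g(theta): probability per unit solid angle of a deflection theta."""
    q = vertex_partition_3d(vertex_energy)
    return lambda theta: math.exp(-vertex_energy(theta)) / q

XI, ELL = 53.0, 5.0  # nm; illustrative values

def wlc_vertex(theta):
    """Assumed discretized WLC vertex energy, in units of kBT."""
    return XI * theta ** 2 / (2.0 * ELL)

q = vertex_partition_3d(wlc_vertex)
g = tangent_distribution_3d(wlc_vertex)
print(q, 2.0 * math.pi * ELL / XI)  # numerical q next to the small-angle estimate
print(g(0.0), g(0.5))               # the density falls off with deflection angle
```

The small-angle estimate $q \approx 2\pi\ell/\xi$ printed alongside the numerical value is only a rough guide for stiff chains; the numerical integral includes the full $\sin\theta$ measure.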



C. Sub-Elastic Chain Model 



We now introduce an explicit toy model for DNA bending that differs
dramatically from the WLC model. Although the symbolic results below can be applied to
the analysis of any of the semiflexible polymer models specified by Eq. (4), we will
illustrate the method by using the toy model to compute experimental observables
like force extension, the cyclization J factor, etc. for an explicit generalized theory.
The model we will study has a bending energy that is softer for high curvature
than the WLC model. Nevertheless, it reproduces the successful long-length-scale
predictions of WLC.

We have already described one such model in an earlier paper, and a similar
model was also proposed by Yan and Marko. In both cases, the high-curvature
softening was introduced by allowing kinking, or curvature localization: beyond
a critical strain, the DNA's resistance to bending was supposed to fall suddenly to
zero, or some small value. Although these kinking models reproduced the two
desired features mentioned above, they predicted that highly curved DNA should be
generically kinked. However, atomic force microscopy (AFM) imaging of small
minicircles generically shows them as round (although kinking can be induced in
unusual ionic conditions) [25, 26]. Moreover, tightly looped DNA shows enhanced
sensitivity to DNAse digestion that is not concentrated on a single kink point,
but rather is spread throughout the loop [27]. Finally, recent molecular-dynamics
simulations of DNA minicircles show the spontaneous formation of sharp bends
without strand separation. For these reasons, this paper will explore a class
of models with nonlinear DNA elasticity but without the catastrophic breakdown
characteristic of kinking.

The bending energy function we will study comes from the observation, well
known in continuum mechanics, that a rod bending energy density that is
non-convex in curvature induces kinking when the rod is strongly bent [29]. To avoid
kinking, we must therefore require that our effective bending energy density be
everywhere a convex function of curvature, at least on length scales observable via
electron microscopy (EM) or AFM imaging.

A simple choice of bending energy function that meets all the conditions
mentioned, but is radically different from the WLC model, is

$$\epsilon(\kappa) = A|\kappa|, \qquad (9)$$

which defines a family of models parameterized by $A$ and $\ell$ that we call "sub-elastic
chain" (SEC) models. We will show that taking $A = 5.3$ and $\ell = 5$ nm gives rise
to a model with the persistence length $\xi = 53$ nm needed to describe the long-scale
behavior of DNA in moderate-salt solution. As a final motivation, AFM studies
of the tangent-tangent correlation of DNA adsorbed to mica appear to fit a bending
energy of roughly this functional form [10].

The SEC model illustrates several of the points we wish to make. In particular, it
is clear that the bending stiffness in this SEC model is softer than that of
the corresponding WLC model at high deflection, and the energy is nowhere
non-convex. Nevertheless, we will see that the
SEC model reproduces essentially the same behavior as WLC for those aspects of
DNA mechanics that have been well tested.
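
To make the comparison at high deflection concrete, the short sketch below tabulates the single-vertex energies of the two models. It assumes the vertex energies $E_{\rm SEC} = A|\theta|$ and $E_{\rm WLC} = \xi\theta^2/(2\ell)$, both in units of $k_BT$, with the parameter values quoted above; the variable and function names are hypothetical.

```python
A, XI, ELL = 5.3, 53.0, 5.0  # SEC constant, WLC persistence length (nm), link length (nm)

def sec_vertex_energy(theta):
    """E = ell * A * |theta / ell| = A * |theta|, in units of kBT."""
    return A * abs(theta)

def wlc_vertex_energy(theta):
    """E = ell * (xi / 2) * (theta / ell)**2, in units of kBT (assumed quadratic form)."""
    return XI * theta ** 2 / (2.0 * ELL)

theta_equal_energy = 2.0 * A * ELL / XI  # angle at which the two energies coincide
theta_equal_torque = A * ELL / XI        # angle at which the WLC torque xi*theta/ell equals A

for theta in (0.25, 0.5, 1.0, 2.0):
    print(theta, sec_vertex_energy(theta), wlc_vertex_energy(theta))
print(theta_equal_energy, theta_equal_torque)
```

Under these assumptions the SEC vertex costs more energy than the WLC vertex at small deflections and less beyond the crossing angle, which is the sense in which the model is softer at high curvature.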

We emphasize that the SEC model defined by Eq. (9) is a toy model, and not
intended as a realistic, accurate representation of DNA. In particular, the
nonanalytic behavior of Eq. (9) at $\kappa = 0$ is not meant to be taken literally. Instead, it
illustrates our calculational method, and our larger point that the classic
DNA-mechanics tests underdetermine even the coarse-grained effective DNA mechanics
on biologically relevant length scales.



D. Measurements of the short-length-scale bending energy 

FIG. 2: Measuring the deflection force of highly bent short sequences of DNA using a
FRET force sensor [11]. Cyclized sequences of single-stranded DNA are hybridized with
shorter complementary sequences. The single-stranded region of DNA acts as a force
sensor. The external force is measured by the FRET efficiency of FRET dyes (D and A)
positioned at either end of the force sensor. For a rough estimate, we model this molecule
as two links under a deflection force load $f$ induced by the single-stranded DNA linker.
The deflection angle is roughly $2\pi/3$ since the single-stranded DNA is roughly the same
length as the link length.

The force required to tightly bend short sections of DNA has recently been
directly measured by Liphardt and coworkers [11] via a fluorescence resonance energy
transfer (FRET) force sensor. In this experiment, a sequence of DNA 9.18 nm in
length is tightly bent by a linking sequence of single-stranded DNA, as illustrated
in Fig. 2. This contour length is represented in our theory by two links ($\ell = 5$ nm)
and a single vertex. The deflection angle is roughly $2\pi/3$. It is straightforward to
estimate the deflection force in both the discrete WLC and SEC models:

$$f_{\rm SEC} \approx \frac{A\,k_BT}{\ell\cos(\pi/6)} \approx 5.5\ \text{pN}, \qquad (10)$$

$$f_{\rm WLC} \approx \frac{\xi\,k_BT}{\ell^2\cos(\pi/6)}\,\frac{2\pi}{3} \approx 25\ \text{pN}. \qquad (11)$$

(In this estimate, we have used $\ell = 4.6$ nm, half the contour length of the dsDNA.)
The experimentally measured force, 6 ± 5 pN, is approximately equal to that
predicted by the SEC model but is more than a factor of two smaller than that predicted
by the elastic rod model (WLC). These experiments again indicate that the WLC
model fails to describe the high-curvature bending of short sequences of DNA. At
least at this deflection angle, the SEC model approximately predicts the deflection
force. Note that if the kinking models discussed in Sect. II C literally described short
sequences of DNA, this force would be zero, contrary to the experiment. This is another
motivation for our introduction of generalized elasticity models.
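
The two force estimates above follow from balancing the vertex torque $dE/d\theta$ against the applied force times its lever arm, the distance from the vertex to the line joining the free ends. The sketch below reproduces the arithmetic. It assumes $k_BT \approx 4.1$ pN nm (a room-temperature value not stated in the text), the vertex energies $E_{\rm SEC} = A\theta$ and $E_{\rm WLC} = \xi\theta^2/(2\ell)$, and hypothetical function names.

```python
import math

KBT = 4.1  # pN*nm, thermal energy near room temperature (assumed value)

def lever_arm(ell, theta):
    """Distance from the vertex to the line joining the two free ends (nm)."""
    return ell * math.cos((math.pi - theta) / 2.0)

def sec_force(A, ell, theta):
    """SEC torque is A*kBT, independent of the deflection angle."""
    return A * KBT / lever_arm(ell, theta)

def wlc_force(xi, ell, theta):
    """WLC torque is (xi * theta / ell) * kBT, linear in the deflection angle."""
    return (xi * theta / ell) * KBT / lever_arm(ell, theta)

theta = 2.0 * math.pi / 3.0   # deflection angle quoted in the text
f_sec = sec_force(5.3, 4.6, theta)
f_wlc = wlc_force(53.0, 4.6, theta)
print(f_sec, f_wlc)           # forces in pN
```

For a deflection angle of $2\pi/3$ the lever arm reduces to $\ell\cos(\pi/6)$, which is the factor appearing in both estimates.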



E. The propagator and composition 

The locality assumption in the definition of the bending energy implies that
each vertex bends independently. The fundamental tangent distribution function
is the conditional probability of a final tangent, given an initial tangent for a single
vertex. Computing the tangent distribution functions for chains of several links is
therefore straightforward. These conditional probabilities are simply the product
of conditional probabilities for single vertices, summed over the orientations of the
intermediate tangents:

$$G(\vec t; \vec t'; N\ell) = \int d\vec t_1 \cdots d\vec t_{N-2}\; g(\vec t; \vec t_1)\, g(\vec t_1; \vec t_2) \cdots g(\vec t_{N-3}; \vec t_{N-2})\, g(\vec t_{N-2}; \vec t'), \qquad (12)$$

where we have written the $N$-link tangent distribution function as a function of the
arc length, $N\ell$. This notation is needlessly cumbersome. It is therefore convenient
to introduce the propagation operator (or transfer matrix)

$$\mathcal{G} = \int d\vec t\, d\vec t'\; |\vec t\rangle\, g(\vec t; \vec t')\, \langle \vec t'|, \qquad (13)$$

where $\langle\,\cdot\,|$ and $|\,\cdot\,\rangle$ are the canonical bra-ket notation of statistical mechanics (or quantum
mechanics) [31]. These states are a continuum basis:

$$\langle \vec t | \vec t' \rangle = \delta[\vec t - \vec t'], \qquad (14)$$

where $\delta$ is the Dirac delta function on the space of unit tangent vectors.

The propagation operator $\mathcal{G}$ applied to a state gives the state (probability
distribution) after one additional link. This property is called composition and is
a direct result of the locality discussed above. We can now rewrite Eq. (12) more
concisely:

$$G(\vec t; \vec t'; N\ell) = \langle \vec t |\, \mathcal{G} \cdots \mathcal{G}\, | \vec t' \rangle = \langle \vec t |\, \mathcal{G}^N\, | \vec t' \rangle, \qquad (15)$$

where the weighted sum, or path integral, over all intermediate configurations is
now implicit. By changing the basis in the next section, we shall show that this
expression is also a convenient computational tool for understanding general theories.
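
The composition property can be checked directly in two dimensions, where the tangent is specified by a single angle and summing over the intermediate tangent is a circular convolution. The sketch below is a minimal illustration under stated assumptions: it uses the SEC vertex energy $E = A|\theta|$ with $A = 5.3$, discretizes the angle on a uniform grid, and uses hypothetical function names.

```python
import math

def sec_kernel_2d(n, A=5.3):
    """Single-vertex distribution g(theta) on a uniform grid of n angles in [0, 2*pi)."""
    d = 2.0 * math.pi / n
    weights = []
    for k in range(n):
        theta = k * d
        wrapped = min(theta, 2.0 * math.pi - theta)  # deflection magnitude in [0, pi]
        weights.append(math.exp(-A * wrapped))
    norm = sum(weights) * d
    return [w / norm for w in weights]

def compose(g1, g2):
    """Distribution after both steps: sum over the intermediate tangent angle."""
    n = len(g1)
    d = 2.0 * math.pi / n
    return [sum(g1[j] * g2[(k - j) % n] for j in range(n)) * d for k in range(n)]

def mean_cos(g):
    """Grid estimate of <cos(theta)> for a distribution sampled on [0, 2*pi)."""
    n = len(g)
    d = 2.0 * math.pi / n
    return sum(g[k] * math.cos(k * d) for k in range(n)) * d

g = sec_kernel_2d(400)
g2 = compose(g, g)
print(mean_cos(g) ** 2, mean_cos(g2))  # the two agree: modes multiply under composition
```

The agreement of the two printed numbers anticipates the diagonalization discussed next: each angular mode of the distribution simply picks up one factor per link.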



F. Symmetry considerations 



The tangent basis we have exploited to write the tangent distribution function is
not particularly convenient computationally, since the operator is not expressed in
its eigenbasis, in which it is diagonal. To find an eigenbasis for this operator, we
exploit the rigid-body rotational invariance of the tangent distribution function. In
$D$ dimensions, the rigid-body rotational invariance of the model implies that the
propagator commutes with the generators of rotation:

$$[\mathcal{G}, \mathcal{L}_{ij}] = 0, \qquad (16)$$

where $\mathcal{L}_{ij} = -\mathcal{L}_{ji}$ are the generators of rotation in the $ij$ plane. The propagator
therefore also commutes with the Casimir operator, which in quantum mechanics
would correspond to the total angular momentum:

$$\mathcal{L}^2 \equiv \frac{1}{2}\sum_{i,j=1}^{D} \mathcal{L}_{ij}\mathcal{L}_{ij}. \qquad (17)$$

Since $\mathcal{L}^2$ and $\mathcal{G}$ commute, they share an eigenbasis [31]. The angular momentum
states span the tangent space and are eigenstates of $\mathcal{L}^2$:

$$\mathcal{L}^2\,|l\,m\rangle = l(l + D - 2)\,|l\,m\rangle, \qquad (18)$$

where $l$ is the total angular quantum number and we write the other angular
quantum numbers collectively as $m$. The propagator can therefore be expanded in this
eigenbasis [20]:

$$\mathcal{G} = \sum_{l,m} g_l\,|l\,m\rangle\langle l\,m|, \qquad (19)$$

where the $g_l$ are coefficients that depend only on the quantum number $l$ but not on
$m$. Eq. (19) is the desired diagonalization of the propagator $\mathcal{G}$.

Explicitly, in two dimensions, it is convenient to use the eigenfunctions

$$\langle \vec t\,|l\,m\rangle = \frac{1}{\sqrt{2\pi}}\exp(-im\theta), \qquad (20)$$

for integer $m$ and $l = |m|$. Note that the quantum number $m$ is sufficient to describe
the state, but we have introduced a second quantum number, $l$, which is invariant
under a generalized notion of rotational invariance in two dimensions, including the
discrete transformation $\theta \to -\theta$ (parity inversion).

In three dimensions, it is convenient to use the eigenfunctions [31]

$$\langle \vec t\,|l\,m\rangle = Y_{lm}(\theta, \phi), \qquad (21)$$

where the $Y_{lm}$ are the spherical harmonics. In this case $m$ is the eigenvalue of the
$z$ component of the angular momentum operator, $\mathcal{L}_{12}$.

The orthonormality of the basis implies that the $g_l$ are uniquely determined
and can be found in the usual way (Appendix A, Eqs. A1 and A2). It is now
straightforward to perform the path integral of Eq. (12):

$$\mathcal{G}^N = \sum_{l,m} (g_l)^N\,|l\,m\rangle\langle l\,m|, \qquad (22)$$

since the propagation operator is diagonal.

We return now to the SEC model proposed in Sect. II C. The $N$-link tangent
distribution function for the SEC model is shown in Fig. 3. This figure explicitly
illustrates the scale dependence of statistical mechanics theories. For
short-contour-length chains, the WLC and SEC theories make dramatically different predictions,
but as the contour length of the chain increases, the differences between the
distribution functions of the two theories decrease until, at long contour length, the
theories are essentially indistinguishable. This is the essence of the renormalization
group: at short length scales, the mechanics of the chain can be extremely
complicated, but the thermal fluctuations sum over many intermediate configurations
and hide the underlying complexity on longer length scales. We shall show this for
general theories in Sect. II H.
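
The convergence described above can be explored numerically with the diagonalized propagator. The sketch below computes the three-dimensional coefficients $g_l$ as Legendre moments of the single-vertex distribution, raises them to the $N$th power, and compares the SEC and WLC tangent distributions through a total-variation distance on the sphere. It rests on several assumptions not taken from the text: the initial tangent is placed along the polar axis so that only $m = 0$ contributes and the mode sum reduces to Legendre polynomials, the WLC vertex energy is $\xi\theta^2/(2\ell)$ with $\xi = 53$ nm, the mode sum is truncated at $l = 30$, and all function names are hypothetical.

```python
import math

def legendre_all(lmax, x):
    """[P_0(x), ..., P_lmax(x)] from the Bonnet recurrence."""
    p = [1.0, x]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]

def vertex_modes(energy, lmax=30, n=4000):
    """Return (q, [g_0..g_lmax]) with g_l = <P_l(cos theta)> for one 3D vertex."""
    d = math.pi / n
    sums = [0.0] * (lmax + 1)
    for k in range(n):
        th = (k + 0.5) * d
        w = 2.0 * math.pi * math.sin(th) * math.exp(-energy(th)) * d
        for l, p in enumerate(legendre_all(lmax, math.cos(th))):
            sums[l] += w * p
    return sums[0], [s / sums[0] for s in sums]

def n_link_density(g, n_links, theta):
    """G(theta; N) per unit solid angle: sum_l (2l+1)/(4 pi) g_l^N P_l(cos theta)."""
    p = legendre_all(len(g) - 1, math.cos(theta))
    return sum((2 * l + 1) / (4.0 * math.pi) * g[l] ** n_links * p[l] for l in range(len(g)))

def sphere_tv(f1, f2, n=2000):
    """Total-variation distance between two axially symmetric densities on the sphere."""
    d = math.pi / n
    return 0.5 * sum(
        abs(f1((k + 0.5) * d) - f2((k + 0.5) * d)) * 2.0 * math.pi * math.sin((k + 0.5) * d) * d
        for k in range(n)
    )

A, XI, ELL = 5.3, 53.0, 5.0
sec = lambda th: A * th                      # SEC vertex energy in kBT
wlc = lambda th: XI * th * th / (2.0 * ELL)  # assumed discretized WLC vertex energy in kBT
q_sec, g_sec = vertex_modes(sec)
q_wlc, g_wlc = vertex_modes(wlc)

tv_1 = sphere_tv(lambda th: math.exp(-sec(th)) / q_sec, lambda th: math.exp(-wlc(th)) / q_wlc)
tv_20 = sphere_tv(lambda th: n_link_density(g_sec, 20, th), lambda th: n_link_density(g_wlc, 20, th))
print(tv_1, tv_20)  # distance between the two models for 1 link and for 20 links
```

The single-link comparison uses the vertex distributions directly, because the truncated mode sum converges slowly for one link; for many links the high-$l$ modes are strongly suppressed and the truncation is harmless.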



G. Contour length continuation 



Since we will frequently be interested in the properties of the polymer on length 
scales much longer than the fundamental link length £, it is useful to introduce a 
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FIG. 3: Evolution of the 3-dimensional tangent distribution function GsD(t;t ';L) with 
increasing separation L. In the figure above, the WLC and SECtangent distribution 
functions are plotted as a function of the deflection angle 9 for several contour lengths. The 
linear dependence of SECbending energy on the deflection angle, visible in the fundamental 
distribution function (L = I = 5 nm), is lost at longer contour length. For L 3> £, the 
tangent distribution approaches the WLC distribution function with a persistence length 
of 53 nm despite dramatically different behavior at short contour length. This loss of the 
short length structure of the tangent distribution function is universal and explains the 
success of the WLC model in describing many semiflexible polymer phenomena. 




Hamiltonian operator defined by 

H= -£- 1 logg = ^hi\lm){lm\. (23) 

$\mathcal{H}$ is also diagonal in the angular momentum representation with eigenvalues
$h_l = -\ell^{-1} \log g_l$. We call this operator the Hamiltonian operator because, in
the WLC model, the statistical mechanics of the chain corresponds to a quantum particle
on a $(D-1)$-sphere. The tangent distribution function is equal to the quantum
propagator where time has been continued to imaginary arc length. The operator
$\mathcal{H}$ is equal to the Hamiltonian of the corresponding quantum mechanical particle
system.

We can rewrite the $N$-link propagator as

\[ \mathcal{G}_N = \exp(-\mathcal{H} N \ell), \tag{24} \]

where $N\ell$ is the contour length corresponding to $N$ links. The advantage of this
reformulation of the distribution function is that it introduces a natural extension
to fractional numbers of links by replacing $N\ell$ by the contour length $L$, defined for
all positive real numbers:

\[ \mathcal{G}(L) = \exp(-\mathcal{H} L), \tag{25} \]

although, rigorously, it is understood that this function is defined only for contour
lengths equal to an integral number of links.
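The continuation can be made concrete with a small numerical sketch. The code below
assumes three dimensions, an isotropic link whose bending energy depends only on the
deflection angle, and a link energy that is linear in that angle with a slope `ALPHA`.
The value of `ALPHA` is hypothetical: it is picked only so that the resulting persistence
length lands near the 53 nm quoted in the text, and it is not taken from the paper. The
function names are likewise illustrative.

```python
import math


def legendre_values(l_max, x):
    """Return [P_0(x), ..., P_{l_max}(x)] from the Bonnet recurrence."""
    p = [1.0, x]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: l_max + 1]


def link_moments(energy, l_max, n_grid=4000):
    """Legendre moments g_l of the normalized one-link tangent distribution (D = 3).

    `energy(theta)` is the link bending energy in units of kT.
    """
    d_theta = math.pi / n_grid
    norm = 0.0
    g = [0.0] * (l_max + 1)
    for i in range(n_grid):  # midpoint rule on 0 < theta < pi
        theta = (i + 0.5) * d_theta
        weight = math.exp(-energy(theta)) * math.sin(theta) * d_theta
        norm += weight
        for l, p_l in enumerate(legendre_values(l_max, math.cos(theta))):
            g[l] += weight * p_l
    return [g_l / norm for g_l in g]


def hamiltonian_eigenvalues(g, link_length):
    """h_l = -log(g_l) / link_length, as in Eq. 23."""
    return [-math.log(g_l) / link_length for g_l in g]


def continued_moment(h_l, contour_length):
    """l-th moment of the continued propagator, exp(-h_l L), as in Eq. 25."""
    return math.exp(-h_l * contour_length)


LINK_NM = 5.0  # link length quoted in the Fig. 3 caption
ALPHA = 5.4    # HYPOTHETICAL slope (kT per radian) of a linear link energy

g = link_moments(lambda theta: ALPHA * abs(theta), l_max=4)
h = hamiltonian_eigenvalues(g, LINK_NM)
xi_nm = 1.0 / h[1]  # D = 3 persistence length, (D - 1) / (2 h_1)
```

For an integral number of links the continued moment reproduces the discrete result,
`continued_moment(h[l], N * LINK_NM) == g[l] ** N` up to rounding, while for
non-integral `L / LINK_NM` it simply interpolates between those values.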
H. The meaning of persistence length, and the stiff-polymer limit

What is the meaning of persistence length in general models like the ones we
have described? Persistence length describes the length scale on which the polymer
maintains its tangent orientation. For the WLC model in $D$ dimensions, the tangent
persistence is

\[ \langle \vec t(0) \cdot \vec t(L) \rangle = \exp[-L(D-1)/2\xi], \tag{26} \]

where $\xi$ is the persistence length, which also appears in the energy density as the
bending modulus in Eq. 5. In general models, the tangent persistence (Eq. 26) has
the same functional form but $\xi$ no longer corresponds to a bending modulus. We
shall therefore simply use Eq. 26 to define the persistence length of general models.

The tangent persistence corresponds to the $l = 1$ mode of the propagator. (In
the quantum mechanical correspondence, $\vec t$ is a vector and creates a state of spin
one.) Comparing Eqs. 25 and 26, the persistence length is

\[ \xi_D \equiv (D-1)/2h_1, \tag{27} \]

where $h_1$ is the $l = 1$ eigenvalue of the Hamiltonian. Note that we have explicitly
written a subscript $D$ to denote the dependence on dimension. In the WLC model,
$\xi$ is independent of dimension, but in more general models this is not the case.

The persistence length also controls the long-length characteristics of the polymer.
The mean-squared end-to-end distance can be written in terms of the tangent
persistence:

\[ \langle \vec X^2 \rangle = \int_0^L ds \int_0^L ds' \, \langle \vec t(s) \cdot \vec t(s') \rangle . \tag{28} \]

Since Eq. 26 applies to both the WLC model and general models, the dependence
of the mean-squared end-to-end distance on persistence length and contour length
is identical to the WLC model. The same is true for the radius of gyration, which can
also be written in terms of an integral of the tangent persistence. It is also well known
that the long-contour-length spatial distribution of semiflexible polymers is described
by the Gaussian chain model. The width of the Gaussian distribution is determined
by the mean-squared end-to-end distance; the relation between the Kuhn length
and the persistence length is therefore the same for our general models as for the
WLC model.

We can immediately exploit Eq. 23 and Eq. A2 to analyze the SEC model. The
persistence length computed for the SEC model in three dimensions is 53 nm, which
matches solution measurements.

Stiff-polymer limit: We now examine the tangent distribution function in the
stiff-polymer limit and show that the WLC model is universal at long contour length,
as predicted by the renormalization group. Our explicit computations of the
SEC tangent distribution function in Sect. II F have already provided one explicit
example of this behavior, but we address this question generally in this section.

By definition the stiff polymer limit implies that the fundamental tangent distri- 
bution function, g, is narrowly distributed around zero deflection. We will exploit 
this fact by expanding the basis functions in the deflection angle and computing 
the eigenvalues of the propagator (Eq. 19) to lowest order in the deflection angle.
In dimension $D$, this calculation, although straightforward, requires some technical
mathematics. We therefore relegate the details of this calculation to the appendix,
Sect. B, and present only the results.

The propagator in the stiff-polymer limit is (Eq. B13)

\[ \mathcal{G} = 1 - \frac{\ell}{2\xi} C_2 + O[C_2^2 (\ell/\xi)^2], \tag{29} \]

where $\xi$ is the persistence length defined by the $l = 1$ eigenvalue of the Hamiltonian
operator (Eq. 27). Note that the $C_2$ term is understood to be small for small values
of the angular quantum number $l$ since, in the stiff-polymer limit, the link length
$\ell$ is much shorter than the persistence length $\xi$. The corrections are of order
$C_2^2 \langle\theta^4\rangle$ and scale as $l^4$ for large $l$. Clearly this approximation
holds only for small angular quantum number $l$. It is convenient to compute the
Hamiltonian operator

\[ \mathcal{H} = \frac{1}{2\xi} C_2 + O[C_2^2 \, \ell/\xi^2], \tag{30} \]

which is identical to the WLC Hamiltonian to lowest order in the deflection angle.
Again, the correction scales like $l^4$, which implies that this relation holds only for
small angular quantum numbers.

The correspondence between the Hamiltonian operators for general models and
the WLC model for small angular quantum numbers implies that the long-contour-length
behavior of the polymer is universal and determined by the persistence length
alone. This correspondence is shown explicitly for the SEC and WLC theories in
Fig. 4. At long contour length, only states with small $l$ contribute since higher-$l$
contributions decay quickly. Remember that the propagator is

\[ \mathcal{G} = \exp(-\mathcal{H} L), \tag{31} \]

and the eigenvalues of $\mathcal{H}_{WLC}$ scale as $l^2$ for large $l$. The tangent
distribution function is therefore well approximated by the WLC model at long contour
length:

\[ \lim_{L \to \infty} \mathcal{G}(L) = \mathcal{G}_{WLC}(L). \tag{32} \]

The details of the short-length-scale bending energy affect only the large-$l$ eigenvalues
of the Hamiltonian operator and are therefore irrelevant at long contour length,
as predicted by the renormalization group.
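As a worked illustration of this expansion, consider the special case $D = 3$, where the
basis functions reduce to Legendre polynomials and $C_2$ is assumed to have eigenvalues
$l(l+1)$. This is only a sketch of the lowest-order step; the general-$D$ calculation is
the one deferred to the appendix. The symbol $\langle\theta^2\rangle$, the mean-squared
deflection angle of a single link, is introduced here for illustration.

```latex
\begin{align*}
P_l(\cos\theta) &= 1 - \tfrac{1}{4}\, l(l+1)\,\theta^2 + O(\theta^4), \\
g_l = \langle P_l(\cos\theta) \rangle &\approx 1 - \tfrac{1}{4}\, l(l+1)\,\langle\theta^2\rangle, \\
h_l = -\ell^{-1} \log g_l &\approx \frac{l(l+1)\,\langle\theta^2\rangle}{4\ell}, \\
\xi = 1/h_1 &\approx \frac{2\ell}{\langle\theta^2\rangle}
\quad\Longrightarrow\quad
h_l \approx \frac{l(l+1)}{2\xi}.
\end{align*}
```

To this order the only property of the link distribution that survives is
$\langle\theta^2\rangle$, which fixes $\xi$; all other details of the bending energy
enter through the $O(\theta^4)$ terms, which matter only at large $l$.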

Although we have yet to compute the spatial distribution function, we have explicitly
shown that measurements that are only sensitive to the long-length-scale
chain statistics do not determine the short-length-scale behavior of the theory and
that violations of the wormlike chain model, while disguised by thermal fluctuations
at long contour length, are generic as the length scales probed by experiment
approach the fundamental or structural length scale of the chain.

FIG. 4: The eigenspectrum of $\mathcal{H}$ for the SEC and WLC models. The eigenvalues of
the Hamiltonian operator for the WLC and SEC theories are compared as a function
of the angular quantum number $l$. Both theories have an identical persistence length,
$\xi = h_1^{-1} = 53$ nm. The eigenvalues of the Hamiltonian are coincident for small $l$
but diverge as $l$ increases. The $l$th moment of the distribution function decays as
$\exp(-h_l L)$. The larger eigenvalues of the Hamiltonian, for which the two theories
differ, are therefore relevant only for small $L$, implying that the SEC and WLC chain
statistics are identical for long-contour-length chains.



III. THE SPATIAL DISTRIBUTION 



For most applications, it is the spatial distribution of the polymer rather than the 
tangent distribution function which is of phenomenological interest. From solution 
scattering to force-extension, cyclization to looping, the spatial distribution function 
is directly observable. In this section, we shall develop a near exact formalism for 
computing the spatial distribution function. Our focus will be exclusively three 
dimensions but computations in other dimensions are a simple extension of the 
methods discussed here. 

The tangent-spatial distribution function is the probability density of end
displacement $\vec X$ and final tangent $\vec t_f$ given an initial tangent $\vec t_i$ for
an arc length $L$ chain. It is convenient to write the tangent-spatial distribution in
terms of the spatial delta function

\[ G(\vec X; \vec t_f, \vec t_i; L) = \langle \vec t_f | \exp[-\mathcal{H} L] \, \delta^3\Big[ \vec X - \int_0^L ds \, \vec t(s) \Big] | \vec t_i \rangle , \tag{33} \]

where we have written the distribution function in the continuum limit. We shall
reintroduce an operational definition of this continuum limit in a moment.

To compute the tangent-spatial distribution function, we introduce an
operator-valued spatial distribution function:

\[ \mathcal{G}(\vec X; L) \equiv \int d\vec t \, d\vec t' \; |\vec t\rangle \, G(\vec X; \vec t, \vec t'; L) \, \langle \vec t'| , \tag{34} \]

which allows us to keep the tangents implicit in our expressions. We shall call this
operator the spatial propagator since it obeys the composition property of Green
functions:

\[ \mathcal{G}(\vec X; L + L') = \mathcal{G}(\vec X; L) \otimes \mathcal{G}(\vec X; L'), \tag{35} \]

where $\otimes$ denotes the spatial convolution.

As before, it will be convenient to work in the angular momentum basis with the
matrix elements

\[ G_{lm\,l'm'}(\vec X; L) \equiv \langle l\,m | \, \mathcal{G}(\vec X; L) \, | l'\,m' \rangle , \tag{36} \]

since this basis diagonalizes the Hamiltonian (although not the spatial propagator).
Finding the spatial propagator reduces to the ability to explicitly compute all the
$G_{lm\,l'm'}$.

We shall be able to derive exact expressions for the Fourier-Laplace transform of
the spatial propagator in the continuum theory in terms of the transformed matrix
elements (Eq. 36). We adopt the Fourier transform convention

\[ G(\vec k; L) = \mathcal{F}_{X \to k} \, G(\vec X; L) = \int d^3X \; G(\vec X; L) \exp(-i \vec k \cdot \vec X) \tag{37} \]

and the Laplace transform convention

\[ G(\vec k; p) = \mathcal{L}_{L \to p} \, G(\vec k; L) = \int_0^\infty dL \; G(\vec k; L) \exp(-pL). \tag{38} \]

The derivation of the transformed matrix elements exploits the same techniques
used recently by Spakowitz and Wang to derive exact results for the
WLC model. The extension of these results to the generalized theories considered
here is straightforward. We shall therefore include only a brief derivation in
Appendix C, although we discuss the results in the main text.

It is important to note at this stage that the results derived for the spatial
distribution, although derived by a method analogous to that exploited by Spakowitz
and Wang, will not be exact solutions to generalized discrete-link models. Rather, the
results quoted here are exact solutions to the analytically continued theories defined by
Eq. 25. That is, we have assumed a formulation of the discrete-link theories that
defines the tangent distribution function for all $L > 0$, although formally this
distribution function is defined only for contour lengths equal to an integral number
of links. For semiflexible polymers longer than a few links, this is an excellent
approximation. (For instance, the discrete and continued theories are later compared
in Fig. 5.) We have therefore called this solution "near-exact" in the text.

A. The spatial distribution function

In force-extension and solution scattering experiments the tangents of the polymer
are not directly probed by experiment; it is only the spatial distribution function,
rather than the tangent-spatial distribution function, which is observed. We
shall therefore introduce the spatial distribution function, $K(\vec X; L)$, which is
defined as the probability density that a contour length $L$ polymer has end displacement
$\vec X$. The spatial distribution function is the tangent-spatial distribution function
summed over the final tangent and averaged over the initial tangent:

\[ K(\vec X; L) \equiv \frac{1}{4\pi} \int d\vec t_f \, d\vec t_i \; G(\vec X; \vec t_f, \vec t_i; L) = G_{00\,00}(\vec X; L), \tag{39} \]

where the last equality expresses the spatial distribution function in terms of a
matrix element of the spatial propagator.

The Fourier-Laplace transform of this matrix element, a continued fraction, is
computed in Eq. C15. The explicit expression for the transformed spatial distribution
function is

\[ K(k;p) = \cfrac{1}{p + \cfrac{B_1 k^2}{p + h_1 + \cfrac{B_2 k^2}{p + h_2 + \cdots}}} , \tag{40} \]

where the $h_l$ are the eigenvalues of the Hamiltonian operator, Eq. 23, and the $B_n$
coefficients are defined as

\[ B_n \equiv \frac{n^2}{4n^2 - 1}. \tag{41} \]

This expression is identical to that derived for the WLC model, except that the
eigenvalues of the Hamiltonian operator, $h_l$, are those for a generic theory rather
than the WLC eigenvalues. Otherwise the expression is unchanged.
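A continued fraction of this kind is straightforward to evaluate numerically by
truncating it at a finite level and assembling it from the bottom up. The sketch below
assumes three dimensions and the coefficients $B_n = n^2/(4n^2 - 1)$ known from the WLC
result; the function name, the truncation depth, and the sample values of `k` and `p` are
illustrative choices, not taken from the paper. The WLC spectrum is used as input only to
have a concrete set of eigenvalues; any other list of $h_l$ could be passed instead.

```python
def transformed_spatial_distribution(k, p, h):
    """Evaluate the continued fraction for K(k; p), truncated at l = len(h) - 1 (D = 3)."""
    tail = 0.0
    for n in range(len(h) - 1, 0, -1):  # assemble the fraction from the deepest level up
        b_n = n * n / (4.0 * n * n - 1.0)  # assumed coefficient B_n
        tail = b_n * k * k / (p + h[n] + tail)
    return 1.0 / (p + h[0] + tail)


XI_NM = 53.0  # persistence length quoted in the text
# WLC spectrum in three dimensions, h_l = l (l + 1) / (2 xi), with h_0 = 0
h_wlc = [l * (l + 1) / (2.0 * XI_NM) for l in range(40)]

k, p = 1e-3, 1e-5  # small wave number and Laplace variable, both in 1/nm
k_wlc = transformed_spatial_distribution(k, p, h_wlc)
k_gaussian = 1.0 / (p + XI_NM * k * k / 3.0)  # Gaussian chain with Kuhn length 2 xi
```

At zero wave number the fraction collapses to $1/p$, the Laplace transform of a
normalized distribution, and for small `k` and `p` the result approaches the
Gaussian-chain form, so both serve as quick sanity checks on an implementation.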

The spatial distribution functions for the WLC and SEC models are plotted in
Fig. 5 for several contour lengths. The numerical techniques applied in this computation
are described in Appendix D. These results again display the renormalization
group flow toward the WLC model at long contour length. Although the two theories
make dramatically different predictions for short-contour-length chains, the
predictions coincide at long contour length!

The suppression of the short-length-scale structure of the theory can again be
understood in terms of the eigenvalues of the Hamiltonian operator. The levels of the
continued fraction (Eq. 40) can be understood as contributions from transitions to
states of increasing angular quantum number $l$. But these high-angular-momentum
states decay quickly due to their large eigenvalues of the Hamiltonian. We can also
understand the irrelevance of high-angular-momentum states at long length in terms
of the wave number $k$. Long length scales correspond to small wave number. The
levels of the continued fraction are multiplied by $k^2$ and are therefore suppressed
for small wave number, implying that the higher-angular-momentum states have
successively less relevance at long length scales.

FIG. 5: The spatial distribution for the WLC and SEC theories, plotted against $R/L$.
All curves except the black dotted curve have been computed using the inverse transform
technique. To check the validity of the technique, the black dots show a direct Monte
Carlo integration for the shortest contour length SEC curve (red). We have chosen the
contour lengths of the chains to illustrate two types of renormalization. At 50 nm for
large deflection ($R/L \approx 0$), the SEC (solid) and WLC (dotted) theories differ by two
orders of magnitude. For a 200 nm contour length, SEC and WLC predict nearly identical
distributions, but this distribution is clearly not Gaussian. For long contour length,
however, these theories renormalize to the Gaussian chain model (dashed).

It is also instructive to consider the long-length-scale limit of the spatial
distribution function since we know that this limit is described by the Gaussian chain
model. The long-length-scale limit corresponds to the limit of small wave number $k$ and
small $p$, the variable dual to contour length. In this limit, the transformed spatial
distribution function is

\[ K(k;p) \to \frac{1}{p + \xi k^2/3}, \tag{42} \]

which is just a Gaussian distribution with a Kuhn length of twice the persistence
length (Eq. 26), as we have already argued from computations of the mean-squared
end-to-end distance and as has also been shown schematically for the SEC model in
Fig. 5.
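The connection between the Gaussian limit and the Kuhn length can be spelled out in a
few lines for $D = 3$. The steps below are a sketch that uses the exponential tangent
persistence of Eq. 26 and the standard Gaussian-chain characteristic function; the symbol
$b$ for the Kuhn length is introduced here for illustration.

```latex
\begin{align*}
\langle \vec X^2 \rangle
  &= \int_0^L ds \int_0^L ds' \, e^{-|s - s'|/\xi}
   = 2\xi L - 2\xi^2 \left( 1 - e^{-L/\xi} \right)
   \;\xrightarrow{\; L \gg \xi \;}\; 2\xi L \equiv bL, \qquad b = 2\xi, \\
K(k;L) &\approx \exp\!\left( -\tfrac{1}{6} k^2 \langle \vec X^2 \rangle \right)
   = \exp\!\left( -\tfrac{1}{3} \xi k^2 L \right), \\
K(k;p) &= \int_0^\infty dL \; e^{-pL} \, e^{-\xi k^2 L/3}
   = \frac{1}{p + \xi k^2/3}.
\end{align*}
```

The same expression follows from the first level of the continued fraction when $p$ is
neglected against $h_1 = 1/\xi$, since $B_1 k^2/h_1 = \xi k^2/3$.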

B. Force-extension 

The force-extension of single polymer molecules has long been the subject of
experimental interest. The experimental observable in these experiments,
the extension of the polymer under an external force, can be directly computed from
the spatial distribution function. Typically this force is applied to a bead, tethered
to the polymer, using an optical or magnetic trap [5, 35, 36]. The restoring force
against extension is entropic in nature (for inextensible polymers). This entropic
force is induced by the reduction in the number of micro-configurations available
to the chain as the extension is increased.

FIG. 6: Force-extension for the WLC and SEC models compared with experimental
measurements; the horizontal axis is the scaled force $f\xi/kT$. The WLC model was fit to
the experimental data to determine the contour length and persistence length
($\xi = 53$ nm). Despite the dissimilar short-length-scale tangent distribution function,
the behavior of the polymer under an external force is nearly identical. For forces
greater than 10 pN, intrinsic stretching becomes important, obscuring the entropic part
of the response.

The successful comparison of WLC to single-molecule force-extension data has
been described as the strictest test of the WLC model. But how do other
semiflexible polymer models compare? Can these models also reproduce the precise
fit to experiment? To answer these questions, we next compute the force-extension
for general models and explicitly compare the extension in the SEC and WLC
models (Fig. 6).

The partition function for a polymer under a constant external tension is related
to the Fourier transform of the spatial distribution function via an analytic
continuation of the wave number:

\begin{align}
Z(\vec f) &= \int d^D x \; K(\vec x; L) \exp(\vec f \cdot \vec x) \tag{43} \\
          &= \mathcal{L}^{-1}_{L \to p} \, K(i \vec f; p), \tag{44}
\end{align}

where $\vec f$ is the external force or tension. The extension or mean end distance is
computed in the usual way:

\[ \langle x \rangle = \frac{\partial \log Z}{\partial f}. \tag{45} \]

The force-extension curves for the SEC and WLC models are compared in Fig. 6. The
numerical technique applied in this computation is described in Appendix D.

Despite the drastically different bending energy of the SEC model on short length
scales, thermal fluctuations disguise these differences and give rise to an extension
almost identical to the WLC model. In retrospect, these results are hardly surprising.
The theories are identical at small extension due to the renormalization
group and at large extension due to inextensibility. Although, in principle, the high
force limit is mathematically equivalent to probing short length scales (they are
related by analytic continuation), these differences are not large enough to be
experimentally observable. Physically, the rare high-curvature bending regime, where
the difference between the models is most pronounced, is further suppressed by the
application of tension. For the study of DNA mechanics, force-extension measurements
do probe the persistence length and the inextensibility of DNA, but these
experiments do not effectively probe DNA elasticity on the length scales of interest
for many biological processes.
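For chains much longer than a persistence length, a force-extension curve can be
sketched without an inverse Laplace transform. This is a swapped-in shortcut, not the
numerical method of Appendix D: it assumes that the partition function is dominated by
the lowest eigenvalue $E_0(f)$ of $\mathcal{H} - f\cos\theta$, so that
$\langle x \rangle / L \approx -dE_0/df$. The code works in three dimensions, with the
force in units of $kT$ per nm, and uses the matrix elements
$\langle l-1|\cos\theta|l\rangle^2 = l^2/(4l^2 - 1)$. The function names, truncation
depth, and step size are illustrative.

```python
def ground_state_energy(h, force, iterations=200):
    """Lowest eigenvalue of H - f cos(theta) in the truncated m = 0 basis (D = 3)."""
    n = len(h)
    # Squared off-diagonal elements: f^2 <l-1|cos(theta)|l>^2 = f^2 l^2 / (4 l^2 - 1)
    off_sq = [force * force * l * l / (4.0 * l * l - 1.0) for l in range(1, n)]

    def count_below(x):
        """Sturm count: number of eigenvalues of the tridiagonal matrix below x."""
        q = h[0] - x
        count = 1 if q < 0.0 else 0
        for i in range(1, n):
            if q == 0.0:
                q = 1e-300  # avoid division by zero
            q = h[i] - x - off_sq[i - 1] / q
            if q < 0.0:
                count += 1
        return count

    lo, hi = min(h) - abs(force), h[0]  # the lowest eigenvalue lies in [lo, hi]
    for _ in range(iterations):  # bisection on the Sturm count
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def relative_extension(h, force, rel_step=1e-4):
    """<x>/L = -dE_0/df by central difference (long-chain limit, force > 0)."""
    df = rel_step * force
    e_plus = ground_state_energy(h, force + df)
    e_minus = ground_state_energy(h, force - df)
    return -(e_plus - e_minus) / (2.0 * df)


XI_NM = 53.0  # persistence length quoted in the text
h_wlc = [l * (l + 1) / (2.0 * XI_NM) for l in range(60)]  # WLC spectrum, D = 3

low_force = relative_extension(h_wlc, 0.01 / XI_NM)
high_force = relative_extension(h_wlc, 100.0 / XI_NM)
```

Passing the eigenvalues of a different model in place of `h_wlc` gives that model's curve
under the same assumptions. For the WLC spectrum the two limits can be checked against
the familiar results $\langle x \rangle/L \approx 2f\xi/3$ at low force and
$\langle x \rangle/L \approx 1 - 1/\sqrt{4f\xi}$ at high force.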



C. Structure factor 

Another experimental observable used to characterize polymers is the structure
factor, measured by static light scattering, small-angle X-ray scattering, and neutron
scattering experiments. Measurements of the structure factor can probe the
polymer configuration on a wide range of length scales. Symbolically, the structure
factor is

\[ S(\vec k) \equiv \frac{1}{L^2} \int_0^L ds \, ds' \; \langle \exp\{ i \vec k \cdot [\vec X(s) - \vec X(s')] \} \rangle , \tag{46} \]

where $\vec X(s)$ is the position of the polymer at arc length $s$ and we have included
an extra factor of the contour length in the denominator to make the structure
factor dimensionless. At high wave number, the structure factor is sensitive to
short-length-scale physics, whereas the contour length and radius of gyration are
probed by low wave number. The structure factor can be rewritten in terms of the
transformed spatial distribution function

\[ S(k) = \frac{2}{L^2} \, \mathcal{L}^{-1}_{L \to p} \left[ \frac{K(k;p)}{p^2} \right], \tag{47} \]

where $\mathcal{L}^{-1}$ is the inverse Laplace transform, which can be computed numerically.
As mentioned above, the leading-order contributions at small wave number are
determined by the polymer length and the radius of gyration

\[ L S(k) = L \left( 1 - \tfrac{1}{3} k^2 R_g^2 + \ldots \right), \tag{48} \]

where we have temporarily restored the length dimension of $S$. At large $k$, both
WLC and SEC chains are straight, which gives an asymptotic limit for large wave number

\[ S(k) \to \frac{\pi}{Lk}. \tag{49} \]

The structure factor is compared for the SEC and WLC models in Fig. 7. The
numerical technique applied in this computation is described in Appendix D.

FIG. 7: The structure factor for the SEC and WLC models. In the figure above, the
structure factor $S$, scaled by the wave number, is plotted for several contour lengths.
The curves are nearly identical for the two theories since the structure factor is
dominated by thermally accessible configurations. Although rare, high-curvature
configurations are orders of magnitude more probable in the SEC than in the WLC theory,
these configurations are still too rare to significantly affect the structure factor.
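The step from the double integral over arc length to the Laplace-transformed spatial
distribution function can be sketched as follows. The derivation assumes a homogeneous
chain, so that the average in the integrand depends only on the contour separation
$|s - s'|$ and equals the Fourier-transformed spatial distribution function
$K(k; |s - s'|)$.

```latex
\begin{align*}
S(k) &= \frac{1}{L^2} \int_0^L ds \int_0^L ds' \; K(k; |s - s'|)
      = \frac{2}{L^2} \int_0^L dL' \, (L - L') \, K(k; L'), \\
\mathcal{L}_{L \to p}\!\left[ \int_0^L dL' \, (L - L') \, K(k; L') \right]
     &= \mathcal{L}[L] \; \mathcal{L}[K(k;L)] = \frac{K(k;p)}{p^2}
\quad\Longrightarrow\quad
S(k) = \frac{2}{L^2} \, \mathcal{L}^{-1}\!\left[ \frac{K(k;p)}{p^2} \right].
\end{align*}
```

The second line uses the convolution theorem: the integral is the convolution of the
ramp function $L$, whose Laplace transform is $1/p^2$, with $K(k;L)$.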



Again we find that the two theories make nearly identical predictions. The rea- 
soning is again similar to that explained for force-extension. The two theories 
make dramatically different predictions for rare, highly-bent configurations but the 
structure factor is dominated not by these rare high curvature configurations but 
by typical thermal bending. We therefore find that the structure factor, like force 
extension, does not effectively probe the high-curvature statistics of the polymer. 



D. Cyclization 

The biochemical process of DNA cyclization is not in itself a process of particular
biological importance,¹ but cyclization experiments do provide a controlled,
bulk experimental method for probing the probability of rare, highly curved
DNA configurations. In these experiments, linear double-stranded sequences
with complementary single-stranded ends are ligated into cyclized sequences.
The cyclization reaction proceeds via the capture of
rare, thermally activated configurations and is thought to be very similar to the
process by which looped DNA-protein complexes are formed. Cyclization does have
a very clear advantage over protein-induced DNA looping as a method of probing
the high-curvature mechanics of DNA: the chain boundary conditions for cyclization
(tangents aligned) are well known, in marked contrast to most DNA-protein
complexes, where the relevant chain boundary conditions must be determined.

¹ Bacteriophages are known to cyclize their genomes after ejection into a cell, but these
genomes are typically many thousands of base pairs and the barrier to cyclization is
purely entropic.

The cyclization assay is performed under conditions such that the ligation
reaction samples the equilibrium populations of unligated cyclized and oligomerized
polymers [4]. The ratio of the cyclization equilibrium constant ($K_C$) to the
dimerization equilibrium constant ($K_D$) is called the Jacobson-Stockmayer factor [3]
or J factor and is proportional to the tangent-spatial distribution function of the
polymer

\[ J \equiv K_C / K_D = 4\pi \, G(0; \vec t, \vec t; L) = \mathrm{tr} \, \mathcal{G}(0; L), \tag{50} \]

where $G$ is the tangent-spatial distribution function for zero end-to-end displacement
and aligned end tangents, for a contour length $L$ polymer. The J factor can
also be written as the trace of the spatial propagator. (The matrix elements of the
spatial propagator are written explicitly in Appendix C.) Physically, the J factor is
proportional to the concentration of one end at the other with the correct (aligned)
orientation for hybridization.
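Because the J factor is an effective concentration, a distribution function computed
with lengths in nanometers has to be converted to molar units before it can be compared
with measured values. The sketch below assumes that $G$ is expressed per cubic nanometer
per unit solid angle; the function names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol, exact SI value
LITERS_PER_NM3 = 1e-24    # 1 nm^3 = 1e-27 m^3 = 1e-24 L


def per_nm3_to_molar(density_per_nm3):
    """Convert a number density in 1/nm^3 to a molar concentration (mol/L)."""
    return density_per_nm3 / (AVOGADRO * LITERS_PER_NM3)


def j_factor_molar(g_per_nm3):
    """J = 4 pi G(0; t, t; L), with G per nm^3 per unit solid angle, in mol/L."""
    return per_nm3_to_molar(4.0 * math.pi * g_per_nm3)
```

One molecule per cubic nanometer corresponds to roughly 1.66 M, so densities of order
$10^{-9}$ per cubic nanometer correspond to nanomolar effective concentrations.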

Our analysis neglects the condition that DNA twist must also be aligned, which 
requires the use of models including the twist degree of freedom. This additional 
constraint modulates the J factor with a 10.5 bp period equal to the helical repeat. 
Our interest here is in the value of the J factor averaged over a helical repeat, for
which the effects of twist can be roughly ignored.

Fig. 8 compares the cyclization J factor for the SEC and WLC theories. The
numerical techniques applied in this computation are described in Appendix D. The
J factors for sequences with contour lengths greater than two persistence lengths
have long been known to match the predictions of the WLC model. For
sequences shorter than two persistence lengths, the figure illustrates the
short-contour-length breakdown of the WLC model in describing the chain statistics of the
SEC model. For example, for contour lengths of roughly 0.6 persistence lengths,
which correspond to loops with approximately the same radius of curvature as DNA
bound to histones in nucleosome complexes, the SEC model J factor is three orders
of magnitude larger than predicted by the WLC model, in rough agreement with
the cyclization measurements of Cloutier and Widom [6], as illustrated in Fig. 8.

The qualitative picture illustrated in Fig. 8 (the WLC model describes
long-contour-length chain statistics, but fails at sufficiently short contour length) is
the generic result from J factor computations in general models. These results
were qualitatively predicted by the renormalization group ideas we have discussed
throughout the paper. From an experimental perspective, the cyclization assay is
clearly a powerful technique for probing the short-contour-length chain statistics of
DNA. In particular, this technique has very clear advantages over force-extension
and solution-scattering experiments since (i) cyclization assays probe the chain
statistics of DNA in a way that is qualitatively similar to biological DNA looping
applications and (ii) cyclization experiments are extremely sensitive to the
differences between models at short contour length.

FIG. 8: The cyclization J factor: probing the high-curvature chain statistics. In the
figure above, the cyclization J factor in units of molarity is plotted for the WLC
(dashed blue curve) and SEC (dashed red curve) models and compared with experimental
measurements (circles). The theoretical curves do not include the twist-induced
modulation visible in the continuous sets of experimental data (solid curves).
The renormalization group predicts that the SEC model will be identical to WLC for
long-contour-length sequences. But, for sequences shorter than two persistence lengths
(<200 bp), the short-contour-length chain statistics become important and the SEC J
factor diverges from the WLC prediction. In fact, for 94 bp sequences, the SEC J factor
is three orders of magnitude larger than that predicted by the WLC model, roughly
matching the J factors measured by Cloutier and Widom [6] (red circles and solid curves),
whereas subsequent measurements by Du et al. (blue solid curves) are commensurate with
the predictions of the WLC model (dashed blue curves). Our results predict that a
short-contour-length anomaly in the J factor is generic for sufficiently short sequences,
but the contour length at which the WLC model breaks down is model dependent.

E. Beyond the J factor 

The J factor is not the only effective concentration of interest. DNA looping
is integral to the function of many gene regulatory proteins. The affinity of these
proteins for DNA, and therefore their function, depends sensitively on the looping
free energy, or equivalently the effective concentration of the looped DNA. Once the
geometry of the loop is known (the displacement of the binding sites $\vec X$ and the
orientation of the bound DNA, $\vec t$ and $\vec t'$), both the SEC and WLC models make
predictions for the effective concentration:

\[ [\text{effective concentration}] = 4\pi \, G(\vec X; \vec t, \vec t'; L). \tag{51} \]

These statistical mechanics predictions can then be directly compared with
quantitative measurements of gene expression and in vitro experiments [16].

More general cyclization measurements may also be performed. For instance, 
cyclizing sequences with two short single-stranded gaps could be used to probe 
short-length-scale DNA mechanics. The short single-stranded sequences are very 
flexible and can be approximated as free hinges. This technique could be exploited 
to directly measure the spatial distribution function shown in Fig. 5 for very short
sequences of DNA. 



IV. DISCUSSION 
A. The SEC 

Sect. II introduced the SEC as a toy model for DNA bending, motivated
by several physical measurements on DNA. We proceeded to show that this simple
model exhibited the long-length-scale chain statistics of the WLC model, despite
dramatically increasing the predicted probability of high-curvature configurations.
In particular, we showed that the SEC model yields a cyclization J factor in agreement
with the measurements of Cloutier and Widom [6]. More generally, we argued
that this type of deviation from WLC behavior is generic in semiflexible chain models.

These putative deviations of DNA chain statistics from the wormlike chain model 
at short contour length are quite relevant for structural biology, where the typical 
radius of curvature induced by DNA binding proteins is on the order of nanometers or
tens of nanometers, not persistence lengths. For example, the radius of curvature
of DNA bound in a nucleosome complex is roughly 6 nm. The structure of this
complex shows sharp bends, but no sign of melting, consistent with our SEC model.
Similarly, DNA looped by a gene regulatory protein is typically bent on short
length scales [43]. If DNA is described by the SEC model, these tightly-bent DNA-
protein complexes are orders of magnitude more stable than predicted by the WLC 
model. A quantitative understanding of biological DNA bending therefore awaits 
a consistent model of short-length-scale DNA bending. 

Unfortunately, precise quantitative tests of short-length-scale DNA bending are
still in the future. Vologodskii and coworkers recently made measurements questioning
the results of Cloutier and Widom [7]. Their measurements suggest that the
J factor agrees with that predicted by the WLC model, at least down to a contour
length of 100 bp. Widom and coworkers then repeated their own measurements,
however, and have confirmed their previous results [48]. Also, Sect. II D mentioned
that Shroff et al. found that linear elasticity fails at high curvature. At the
moment, it is difficult to reconcile all these conflicting experiments. Instead, this
paper has shown that existing experiments do not uniquely confirm the WLC; we
have examined some of the options for theories compatible with those experiments
that appear to be understood.

We have repeatedly emphasized that the SEC is more a proof of principle than 
a finished theory. It is a generalization of the WLC that is extremely compact 
to state and can be solved almost analytically. It shows that the classic successes 
of WLC can be reconciled with more recent indications of elastic breakdown. It 
encodes locality at the mesoscopic length scale $\ell \approx 5$ nm, but assumes that linear
elasticity does not hold at that scale. Indeed, we do expect that linear elasticity 
will break down at length scales corresponding to the curvature radius at which 
the DNA duplex is not a minimum of free energy. We would not expect the usual 
duplex form to be stable when bent into a loop of radius 5 nm. 

The SEC's other, less realistic features, such as the neglect of sequence depen- 
dence, can readily be addressed, albeit at the cost of explicit solutions. Its bending 
energy function, however, is not meant to be a literal depiction of DNA mechanics. 
In principle, the true bending energy function can eventually be deduced from sta- 
tistical analysis of sufficiently accurate determinations of DNA contours, obtained 
either in solution via cryo EM, or when adsorbed to surfaces by AFM or EM. Alter- 
natively, the short-length-scale bending energy might be calculated using molecu- 
lar dynamics simulations. Direct all-atom molecular dynamics computations of the 
chain statistics for long-contour-length sequences of DNA are prohibitive computa- 
tionally, but the generalized polymer model described in this paper is based upon 
the chain statistics of short-contour-length links which may be directly simulated. 

B. Future directions 

For many biological applications of DNA chain statistics, the twist degrees of 
freedom are also of great importance. For instance for DNA looping, moving an 
operator (the DNA binding sequence) a, few base pairs can change the looping 
probability by an order of magnitude [43j |. This dramatic, short-contour- length de- 
pendence arises from the necessity of bring the DNA operator into twist registry 
with the binding site. The twist degree of freedom of DNA has also been described 
by a fluctuating elastic rod, the Helical Wormlike Chain model (HWLC) [3^]. At 
long length scales, this modified WLC model has successfully described the twist 
dependence of DNA. Nevertheless, at high enough strain the HWLC model breaks 
down. For example, Bryant et al. have demonstrated that the restoring torque gen- 
erated by twisted DNA saturates for high twist densities, implying that the linear 
elastic model breaks down when the undertwist $|\Delta\omega|$ exceeds 0.01 radian/basepair
[49]. The twist density needed to join a mis-phased DNA loop of under 100 bp
exceeds this threshold, and indeed Cloutier and Widom have also shown that the 
twist-induced modulation of the cyclization J factor is smaller for short sequences 
than predicted by the HWLC model [9].

Thus, although the bending of DNA for small twist densities may be adequately 
described by the HWLC model, a generalized model of DNA, including elastic 
breakdown of both bend and twist stiffnesses, may be necessary to describe the 
chain statistics of short sequences of looped DNA that are not naturally in twist 
registry when bound. Such generalized models are in principle a straightforward 
extension of the theory presented in this paper and new exact results for the HWLC 
model recently derived by Spakowitz [14].



V. CONCLUSION 



In this paper, we have explored a class of generalized semiflexible polymer models
in which the bending energy density is an arbitrary function of curvature. To ana- 
lyze the chain statistics of these models, we developed a formalism that is analogous 
to the techniques used for describing the WLC model. We demonstrated that the 
statistics of these general models coincide with those of the linear-elastic (WLC) 
model at long contour length, as predicted by the renormalization group. At short
length scales, we showed that the predictions of these models can differ dramatically
from those of the WLC model. We computed near-exact expressions for the transformed
spatial and tangent-spatial distribution functions with a method analogous
to that recently exploited to find exact results for the WLC model. These gen- 
eralized models provide an explicit example of a non-renormalizable model which 
is nearly exactly solvable. We exploited these general theoretical results to com- 
pute several important experimental observables: force-extension, the structure 
factor, and the cyclization J factor. We explicitly performed these computations 
for a toy model of DNA bending, the Sub-Elastic Chain (SEC) model. The pre- 
dictions of this model are essentially indistinguishable from the WLC model for 
force-extension, solution scattering, and long-contour-length cyclization measure- 
ments, despite dramatic differences between the bending energies of the two models 
on short length scales. For short-contour-length cyclization experiments, general 
models generically predict large deviations from WLC behavior. In particular, we
computed the J factor for the SEC model and showed that this model could account
for the anomalously large cyclization J factor measured by Widom and Cloutier 
0. We expect these generalized models to be widely applicable for describing the 
high-curvature statistics of other semiflexible polymers. 

APPENDIX A: EXPLICIT EXPRESSIONS FOR $g_l$

It is straightforward to determine the $g_l$ eigenvalues of any propagator using the
orthonormal eigenbasis of the angular momentum representation. In two dimensions,
the $g_l$ are

$$g_l = \int_{-\pi}^{\pi} d\theta \; g(\vec t(\theta); \hat e_z) \, \exp(i l \theta), \tag{A1}$$

where $\theta$ is defined as the angle away from the $z$ axis: $\vec t(0) = \hat e_z$. In three dimensions,
the $g_l$ are

$$g_l = \int d^2 t \; g(\vec t(\theta); \hat e_z) \, P_l(\vec t \cdot \hat e_z), \tag{A2}$$

where the $P_l$ are the Legendre polynomials and $\cos\theta = \vec t \cdot \hat e_z$.
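
As a numerical illustration of Eqs. (A1) and (A2), the following Python sketch evaluates the moments by simple quadrature. The harmonic link weight, its stiffness value, and all function names are assumptions made for this example; any normalizable weight that depends only on the deflection angle could be substituted.

```python
import numpy as np
from numpy.polynomial import legendre


def trapezoid(y, x):
    """Composite trapezoid rule, written out to avoid NumPy version differences."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))


def harmonic_weight(theta, stiffness=20.0):
    """Assumed test weight: Boltzmann factor of a harmonic bend energy."""
    return np.exp(-0.5 * stiffness * theta**2)


def moments_2d(weight, l_max, n_grid=20001):
    """Eq. (A1): g_l = int_{-pi}^{pi} dtheta g(theta) exp(i l theta), with g_0 = 1."""
    theta = np.linspace(-np.pi, np.pi, n_grid)
    w = weight(theta)
    norm = trapezoid(w, theta)
    g = [trapezoid(w * np.exp(1j * l * theta), theta) / norm for l in range(l_max + 1)]
    # For a weight that is even in theta the imaginary parts vanish.
    return np.array(g).real


def moments_3d(weight, l_max, n_grid=20001):
    """Eq. (A2): g_l = int d^2t g(theta) P_l(cos theta), d^2t = 2 pi sin(theta) dtheta."""
    theta = np.linspace(0.0, np.pi, n_grid)
    w = 2.0 * np.pi * np.sin(theta) * weight(theta)
    norm = trapezoid(w, theta)
    x = np.cos(theta)
    g = [trapezoid(w * legendre.legval(x, [0.0] * l + [1.0]), theta) / norm
         for l in range(l_max + 1)]
    return np.array(g)


g2d = moments_2d(harmonic_weight, l_max=3)
g3d = moments_3d(harmonic_weight, l_max=3)
```

For this narrow weight the logarithms of the three-dimensional moments scale approximately as $l(l+1)$, which anticipates the stiff-polymer limit of Appendix B.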

APPENDIX B: STIFF POLYMER LIMIT 

In this section, we show that a narrowly distributed fundamental tangent dis- 
tribution function generically implies WLC statistics at long contour length. In 
dimension $D$, this calculation, though straightforward, requires some technical
mathematics, but these technical details are not important for the interpretation 
of the result. 

We begin the derivation with the definition of the $l$th moment of the tangent
distribution function expressed in terms of the propagator (Eq. 19):

$$g_l = \langle l m | G | l m \rangle, \tag{B1}$$

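where rigid-body-rotational invariance implies that $g_l$ is independent of $m$. We
insert two complete sets of states in the tangent representation:

$$g_l = \int d\vec t \, d\vec t\,' \; \langle l m | \vec t \rangle \langle \vec t | G | \vec t\,' \rangle \langle \vec t\,' | l m \rangle. \tag{B2}$$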

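We can now replace the matrix element of the propagator with the fundamental
tangent distribution function $g(\vec t; \vec t\,')$ (Eq. 13). Remember that this function
depends only on the relative deflection angle of the tangents. We therefore replace
the integral over the second tangent with an integral over rotation matrices, $\mathcal{R}$, and
make the substitution $\vec t\,' = \mathcal{R}\vec t$:

$$g_l = \int d\vec t \, d\mathcal{R} \; \frac{d\vec t\,'}{d\mathcal{R}} \, \langle l m | \vec t \rangle \, g(\vec t; \mathcal{R}\vec t) \, \langle \vec t | \mathcal{D}^\dagger_{\mathcal{R}} | l m \rangle, \tag{B3}$$

where we represent the change in measure symbolically and we have introduced the
rotation operator

$$\mathcal{D}_{\mathcal{R}} | \vec t \rangle = | \mathcal{R}\vec t \rangle. \tag{B4}$$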

Our interest is in the case where the tangent distribution function is narrowly
distributed. We shall therefore expand the rotation operator, $\mathcal{D}$, with respect to
the rotation angles, which we shall assume are small. The rotation operator can be
expanded in terms of these angles and the rotation generators [31]:

$$\mathcal{D}_{\mathcal{R}} = \exp\left(-\tfrac{i}{2}\, \theta_{ij} C_{ij}\right), \tag{B5}$$

where the $\theta_{ij} = -\theta_{ji}$ are the components of the rotation angle which multiply the
generators of rotations in the $ij$ plane.

To evaluate the integral over the rotation matrices, we must now choose a set of
$\theta$'s which give a single cover of the tangent space. Since $g(\vec t; \mathcal{R}\vec t)$ is independent of
$\vec t$, it is convenient to choose a coordinate system in which $\vec t$ is in the direction of
the $D$ axis. (We shall return to the unrotated frame before performing the integral
over $\vec t$.) In this new coordinate system, it is convenient to use the cover generated
by the coordinates $\{\theta_{iD}\}_{i=1,\ldots,D-1}$ while setting all other $\theta$'s to zero.

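We denote the average taken with respect to the distribution function by $\langle\;\rangle$.
Due to rigid-body-rotational invariance around the $D$ axis,

$$\langle \theta_{iD} \rangle = 0, \tag{B7}$$
$$\langle \theta_{iD}\, \theta_{nD} \rangle = \langle \theta^2 \rangle \, \delta_{in} / (D-1), \tag{B8}$$

where

$$\theta^2 \equiv \sum_{i=1}^{D-1} \theta_{iD}^2 \tag{B9}$$

defines the total deflection angle $\theta$.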

The nonzero matrix elements can be put in a coordinate-invariant form,

$$\langle l m | \hat e_D \rangle \langle \hat e_D | C_{Di} C_{Di} | l m \rangle = \langle l m | \hat e_D \rangle \langle \hat e_D | C_2 | l m \rangle, \tag{B10}$$

since the added terms in the Casimir operator, $C_2$, are zero on $|\hat e_D\rangle$. We can now
go back to the unrotated coordinate system by setting $\hat e_D = \vec t$.

After integrating over the complete set of tangent vectors, the resulting moment
is

$$g_l = 1 - \tfrac{1}{2} (D-1)^{-1} \langle \theta^2 \rangle \langle l m | C_2 | l m \rangle + O(C^4 \langle \theta^4 \rangle). \tag{B11}$$

Since this expression is only correct to $O(\theta^4)$, it is convenient to replace $\tfrac{1}{2}\theta^2$ with
$1 - \cos\theta$. We can now use the definition of the persistence length given in Eq. 28
to eliminate the dependence on $\langle \cos\theta \rangle$:

$$g_l = 1 - \frac{\ell}{(D-1)\,\xi} \langle l m | C_2 | l m \rangle + O(C^4 \ell^2 / \xi^2). \tag{B12}$$

Finally, we reconstruct the propagator from its moments,

$$G = \sum_{l,m} g_l \, | l m \rangle \langle l m | = 1 - \frac{\ell}{(D-1)\,\xi} C_2 + O(C^4 \ell^2 / \xi^2), \tag{B13}$$

which completes the derivation. This result is discussed in Sect. Ill Hi 
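
The stiff-polymer limit can be checked numerically. The sketch below is an illustration rather than part of the derivation. It assumes $D = 3$, a hypothetical narrow link distribution $g(\theta) \propto \exp(-\alpha\theta)$ with an arbitrarily chosen $\alpha$, and a persistence length defined through $\langle\cos\theta\rangle = \exp(-\ell/\xi)$. It compares the exact moments with the leading-order form $1 - \ell\, l(l+1)/(2\xi)$.

```python
import numpy as np
from numpy.polynomial import legendre


def moments_3d(alpha, l_max, n_grid=200001):
    """g_l in D = 3 for the assumed link distribution g(theta) ~ exp(-alpha * theta)."""
    theta = np.linspace(0.0, np.pi, n_grid)
    w = np.sin(theta) * np.exp(-alpha * theta)
    dtheta = theta[1] - theta[0]
    # The integrand vanishes at both endpoints, so a plain sum equals the trapezoid rule.
    norm = np.sum(w) * dtheta
    x = np.cos(theta)
    return np.array([np.sum(w * legendre.legval(x, [0.0] * l + [1.0])) * dtheta / norm
                     for l in range(l_max + 1)])


def stiff_limit(g1, l, D=3):
    """Leading-order moment, using ell/xi = -ln(g_1) and <lm|C_2|lm> = l (l + D - 2)."""
    ell_over_xi = -np.log(g1)
    return 1.0 - ell_over_xi * l * (l + D - 2) / (D - 1)


alpha = 50.0                      # hypothetical value, chosen to make the distribution narrow
g = moments_3d(alpha, l_max=4)
prediction = np.array([stiff_limit(g[1], l) for l in range(5)])
max_error = np.max(np.abs(g - prediction))
```

The residual `max_error` is of the order of the neglected $\langle\theta^4\rangle$ terms and shrinks as `alpha` is increased.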

APPENDIX C: THE TRANSFORMED SPATIAL PROPAGATOR 

To derive closed-form expressions for the spatial propagator, we Fourier transform
the spatial propagator over the relative displacement, $\vec X$. In particular, we
consider the Fourier transform of Eq. 23, since in Fourier space the spatial convolutions
are simply products:

$$\mathcal{G}(\vec k; L + L') = \mathcal{G}(\vec k; L)\, \mathcal{G}(\vec k; L'). \tag{C1}$$

We choose a coordinate system where k is in the z direction. 

We now wish to use this composition property of the spatial propagator to write
a differential equation for $\mathcal{G}$. We therefore consider $\mathcal{G}$ for a differential arc length
$dL$ and then expand the Fourier transform of Eq. 33 for arc length $dL$:

$$\mathcal{G}(\vec k; dL) = \mathcal{I} - A\, dL, \tag{C2}$$

where $\mathcal{I}$ is the identity operator and $A = \mathcal{H} + i k \cos\theta$, where $\theta$ takes its canonical
meaning in spherical coordinates: $\cos\theta = \vec t \cdot \hat z$. Substituting this expression into
Eq. C1, we can write a differential equation for $\mathcal{G}$:

$$\frac{\partial}{\partial L} \mathcal{G}(\vec k; L) = -A\, \mathcal{G}(\vec k; L). \tag{C3}$$

It is now convenient to make a Laplace transform from arc length $L$ to its conjugate
variable $p$. After solving for the propagation operator, we have an operator equation
for the Laplace-Fourier transform of the spatial propagator:

$$\mathcal{G}(\vec k; p) = \{p\,\mathcal{I} + A(k)\}^{-1} = \{p\,\mathcal{I} + \mathcal{H} + i k \cos\theta\}^{-1}, \tag{C4}$$

but this expression is not explicit since it is written in terms of the inverse of an
infinite-dimensional operator.

We can express $\cos\theta$ in the angular momentum basis. It is most convenient to
define a set of ladder operators:

$$\cos\theta = a_+ + a_-, \tag{C5}$$

where the ladder operators are defined by

$$a_+ = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} A_{l+1,l,m} \, | l+1 \; m \rangle \langle l \; m |, \tag{C6}$$

$$a_- = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} A_{l,l+1,m} \, | l \; m \rangle \langle l+1 \; m |, \tag{C7}$$

and the $A_{l,l+1,m}$ are

$$A_{l,l+1,m} = A_{l+1,l,m} = \sqrt{\frac{(l-m+1)(l+m+1)}{(2l+1)(2l+3)}}. \tag{C8}$$

The ladder operators raise (lower) the total angular momentum quantum number $l$
of a state by one.
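
To make the ladder-operator representation concrete, the following Python sketch builds the truncated matrix of $\cos\theta$ for fixed $m$. It is an illustration only: the coupling coefficients are taken to be the standard matrix elements of $\cos\theta$ between spherical harmonics, and the function names and the cutoff `l_max` are assumptions of the example.

```python
import numpy as np


def A(l, m):
    """Matrix element <l+1 m| cos(theta) |l m> between spherical harmonics."""
    return np.sqrt((l - m + 1) * (l + m + 1) / ((2 * l + 1) * (2 * l + 3)))


def cos_theta_matrix(l_max, m=0):
    """Truncated matrix of cos(theta) = a_+ + a_- on the states l = |m|, ..., l_max."""
    ls = np.arange(abs(m), l_max + 1)
    off = A(ls[:-1], m)
    # np.diag(off, -1) is a_+ (raises l); np.diag(off, 1) is a_- (lowers l).
    return np.diag(off, -1) + np.diag(off, 1)


C0 = cos_theta_matrix(30, m=0)
C1 = cos_theta_matrix(30, m=1)
cos2_00 = (C0 @ C0)[0, 0]   # <00| cos^2(theta) |00>
cos2_10 = (C0 @ C0)[1, 1]   # <10| cos^2(theta) |10>
cos2_11 = (C1 @ C1)[0, 0]   # <11| cos^2(theta) |11>
```

Because the matrix is tridiagonal, every power of $\cos\theta$ connects only states whose $l$ values differ by at most that power, which is what makes the continued-fraction resummation possible.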

Next, we obtain explicit expressions for the matrix elements of the transformed
spatial propagator. The Hamiltonian is diagonal in the angular representation,
so it is convenient to factor the spatial propagator (Eq. C4) into diagonal and
nondiagonal factors,

$$\mathcal{G}(\vec k; p) = \left[ \mathcal{I} + \{p\,\mathcal{I} + \mathcal{H}\}^{-1}\, i k\, (a_+ + a_-) \right]^{-1} \{p\,\mathcal{I} + \mathcal{H}\}^{-1}, \tag{C9}$$

and expand it in a power series:

$$\mathcal{G}(\vec k; p) = \sum_{n=0}^{\infty} \left[ -i k\, \{p\,\mathcal{I} + \mathcal{H}\}^{-1} (a_+ + a_-) \right]^n \{p\,\mathcal{I} + \mathcal{H}\}^{-1}. \tag{C10}$$

As a first step, we will compute a diagonal matrix element:

$$G_{lm\,lm} \equiv \langle l m | \mathcal{G}(\vec k; p) | l m \rangle. \tag{C11}$$

Computing these matrix elements is achieved by grouping the infinite set of terms
in Eq. C10 into subsets which can be summed exactly [12].

We introduce $G^+_{lm\,l'm}$, which is the matrix element of the subset of the terms in
Eq. C10 in which there are only transitions to states with total momentum $l = l'$ or
greater [12]. This matrix element can be defined recursively since only transitions
to adjacent states are possible. The matrix element is the sum over $n$ of the matrix
elements with $n$ transitions to and from the $l \ge l' + 1$ states, which can be written
in terms of $G^+_{l'+1,m,l'+1,m}$. The terms of this matrix element, a geometric series,
can be summed exactly:

$$G^+_{lm\,lm} = \frac{1}{p + h_l} \sum_{n=0}^{\infty} \left[ -\frac{k^2 A_{l,l+1,m}^2 \, G^+_{l+1,m,l+1,m}}{p + h_l} \right]^n \tag{C12}$$

$$= \frac{1}{p + h_l + k^2 A_{l,l+1,m}^2 \, G^+_{l+1,m,l+1,m}}. \tag{C13}$$

This sum is pictured schematically in Fig. 9.

Similarly, we define $G^-_{lm\,l'm}$, which is the matrix element of the propagation
operator which allows transitions to states with total momentum $l = l'$ or less:

$$G^-_{lm\,lm} = \frac{1}{p + h_l + k^2 A_{l,l-1,m}^2 \, G^-_{l-1,m,l-1,m}}. \tag{C14}$$



In terms of $G^\pm$ we can now define the matrix element without transition restrictions
by grouping the transitions into sets that do not cross $l = l'$. These sets can be
written in terms of the matrix elements of $G^\pm$ and then summed in a geometric
series [13]:

$$G_{lm\,lm} = \left[ p + h_l + k^2 A_{l,l+1,m}^2 \, G^+_{l+1,m,l+1,m} + k^2 A_{l,l-1,m}^2 \, G^-_{l-1,m,l-1,m} \right]^{-1}. \tag{C15}$$



The diagonal matrix element computed above is sufficient for describing many ob- 
servables of phenomenological interest. Note that the only difference between this 
expression and the WLC expression jl^J is that the eigenvalues of the Hamiltonian 
operator have changed. 
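
The recursions (C13) and (C14) and the diagonal element (C15) can be evaluated in a few lines. The Python sketch below is an illustration under stated assumptions: it uses the WLC eigenvalues $h_l = l(l+1)/(2\xi)$ as a stand-in (a generalized model would change only `h`), takes the standard spherical-harmonic couplings for $A_{l,l+1,m}$, and truncates at a hypothetical cutoff `l_cut`. The result is cross-checked against direct inversion of the truncated operator $p\,\mathcal{I} + \mathcal{H} + ik\cos\theta$.

```python
import numpy as np


def A(l, m):
    """Coupling between the states l and l+1 at fixed m (standard cos(theta) matrix element)."""
    return np.sqrt((l - m + 1) * (l + m + 1) / ((2 * l + 1) * (2 * l + 3)))


def h_wlc(l, xi=1.0):
    """Assumed Hamiltonian eigenvalues: the WLC values l (l + 1) / (2 xi)."""
    return l * (l + 1) / (2.0 * xi)


def G_plus(l, m, k, p, h=h_wlc, l_cut=30):
    """Eq. (C13), iterated from the cutoff l_cut down to l."""
    g = 0.0
    for j in range(l_cut, l - 1, -1):
        g = 1.0 / (p + h(j) + (k * A(j, m)) ** 2 * g)
    return g


def G_minus(l, m, k, p, h=h_wlc):
    """Eq. (C14), iterated from the lowest state l = |m| up to l."""
    g = 0.0
    for j in range(abs(m), l + 1):
        coupling = A(j - 1, m) if j > abs(m) else 0.0
        g = 1.0 / (p + h(j) + (k * coupling) ** 2 * g)
    return g


def G_diag(l, m, k, p, h=h_wlc, l_cut=30):
    """Eq. (C15): diagonal element with no restriction on the transitions."""
    up = (k * A(l, m)) ** 2 * G_plus(l + 1, m, k, p, h, l_cut)
    down = (k * A(l - 1, m)) ** 2 * G_minus(l - 1, m, k, p, h) if l > abs(m) else 0.0
    return 1.0 / (p + h(l) + up + down)


def G_direct(l, m, k, p, h=h_wlc, l_cut=30):
    """Cross-check: invert the truncated matrix p*I + H + i*k*cos(theta) directly."""
    ls = np.arange(abs(m), l_cut + 1)
    off = 1j * k * A(ls[:-1], m)
    M = np.diag(p + h(ls)).astype(complex) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.inv(M)[l - abs(m), l - abs(m)]
```

For a fixed cutoff the continued fraction and the matrix inverse agree to rounding error, since both describe the same tridiagonal operator. The continued fraction costs only $O(l_{\rm cut})$ operations per matrix element.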

FIG. 9: Diagrammatic rules for the propagator: diagrams and their algebraic representations.
Connected diagrams represent the products of the corresponding algebraic representations.
The matrix element of the spatial propagator $G_{lm\,l'm}$ is the sum of all diagrams
which begin at state $l\,m$ and end at state $l'\,m$ with an arbitrary number of intermediate
transitions. (a) Horizontal lines represent propagation. Vertical lines represent transitions
induced by the wave number. $G^+_{lm\,lm}$ is the matrix element of the spatial propagator
where transitions to states with total angular momentum $l - 1$ or smaller are forbidden.
This matrix element is represented by the line with ellipses, representing all transitions
to states with higher $l$. (b) $G^+_{lm\,lm}$ can be defined recursively in terms of $G^+_{l+1,m,l+1,m}$.
The definition of $G^-_{lm\,lm}$ is analogous, but it is the sum of all diagrams with transitions to
states with total angular quantum number $l$ and smaller.



For some applications we will want completely general matrix elements $G_{lm\,l'm'}$.
We can again define these general matrix elements in terms of the recursive definitions
of $G^\pm$. Again, the trick to summing the terms is grouping them. In this
general case, there are many equivalent ways of achieving this grouping. See Fig. 10
for an explanation of the set grouping. The matrix element can be written [13]

$$G_{l+n,m,l,m'} = G_{l,m,l+n,m'} = \delta_{m m'} \, G_{lm\,lm} \prod_{q=1}^{n} \left( -i k A_{l+q-1,l+q,m} \right) G^+_{l+q,m,l+q,m}. \tag{C16}$$

We have now explicitly solved for the spatial propagator, having written expressions for
all of its matrix elements.
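
The product form of the off-diagonal elements can be verified numerically in the simplest case. The Python sketch below is restricted, as an assumption of the example, to $l = 0$ and $m = m' = 0$, where $G_{00\,00} = G^+_{00\,00}$ because no states lie below $l = 0$. Each step down the ladder contributes a vertex factor $-ikA$, following the power series (C10). WLC eigenvalues and the cutoff `l_cut` are again placeholders.

```python
import numpy as np


def A(l):
    """Coupling between the states l and l+1 in the m = 0 sector."""
    return (l + 1) / np.sqrt((2 * l + 1) * (2 * l + 3))


def h_wlc(l, xi=1.0):
    """Assumed Hamiltonian eigenvalues: the WLC values l (l + 1) / (2 xi)."""
    return l * (l + 1) / (2.0 * xi)


def G_plus(l, k, p, h=h_wlc, l_cut=30):
    """Upward continued fraction (C13), iterated from the cutoff down to l."""
    g = 0.0
    for j in range(l_cut, l - 1, -1):
        g = 1.0 / (p + h(j) + (k * A(j)) ** 2 * g)
    return g


def G_offdiag(n, k, p, h=h_wlc, l_cut=30):
    """Element between the l = n and l = 0 states (m = 0), built as a product of n steps."""
    g = G_plus(0, k, p, h, l_cut)            # equals G_{00,00} in this sector
    for q in range(1, n + 1):
        g = g * (-1j * k * A(q - 1)) * G_plus(q, k, p, h, l_cut)
    return g


def G_direct(n, k, p, h=h_wlc, l_cut=30):
    """Cross-check by direct inversion of the truncated p*I + H + i*k*cos(theta)."""
    ls = np.arange(l_cut + 1)
    off = 1j * k * A(ls[:-1])
    M = np.diag(p + h(ls)).astype(complex) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.inv(M)[n, 0]
```

The agreement with the matrix inverse for odd as well as even `n` is sensitive to the sign of the vertex factor, so the comparison doubles as a check on the sign convention.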



APPENDIX D: THE COMPUTATION OF SPATIAL DISTRIBUTIONS 

The previous section discussed near-exact expressions for the Fourier-Laplace 
transformed spatial and tangent-spatial distribution functions. Exact closed-form 
expressions for these functions are unknown and we must invert the transforms 
numerically to compute the distribution functions. 

1. Force-extension and the structure factor 

The computations of force-extension and the structure factor require only a single
numerical inverse Laplace transform. We cut off the continued fraction at $l = 10$
and then used the InverseLaplaceTransform function in Mathematica.

FIG. 10: General matrix elements. A diagram of the sum for the matrix element
$G_{lm\,l+n\,m} = G_{l+n\,m\,lm}$. To compute the matrix element, we group the terms by the
location of the first steps from $l + n$ to $l + n - 1$ and from $l + n - 1$ to $l + n - 2$, etc. In the
diagram, these steps are represented by the vertical lines. We use the $G^+$ operator to sum
over all possible diagrams with upward transitions between these steps. These upward
transitions are represented by the ellipses. We multiply by the transition matrix element
for each of the vertical lines. After we reach $l$ for the first time we allow all transitions up
or down. This enumeration counts each contributing diagram once, but this recipe is not
unique.

2. The spatial distribution and the J factor

For computations of the spatial distribution function and the J factor, we exploited
two different numerical techniques: numerical transform inversion and
Monte Carlo. For contour lengths of a persistence length and above, it is convenient
to directly invert the transforms numerically by truncating the continued
fraction in the transformed propagator (Eq. C15). Typically we used $l = 15$ as the
cutoff, although in some cases higher $l$ values were used for short contour lengths.

In the inverse transform technique, both numerical Laplace and Fourier transform
inversions must be computed. We have used two different implementations for
these computations. (i) In Mathematica, we used the InverseLaplaceTransform
function. We then integrated numerically (using an explicit sum) to invert the
Fourier transform. We found that the built-in numerical integration in Mathematica
was too slow for practical use. (ii) In Matlab, we used a code which explicitly
computed the inverse Laplace transform by computing the sum of the residues of the
inverse Laplace transform contour integral. The Fourier transform inversion was
again performed by numerical integration using an explicit sum. The Matlab code
was based on one shared with us by Andy Spakowitz.
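
One way to realize a residue-sum inversion for the truncated problem is sketched below in Python. This is not the Matlab code referred to above. It assumes WLC eigenvalues, the $m = 0$ sector, a hypothetical cutoff `l_cut`, and a diagonalizable truncated operator. The poles of $\langle 00|\{p\,\mathcal{I} + A\}^{-1}|00\rangle$ then sit at minus the eigenvalues of the truncated matrix of $A$, and the inverse Laplace transform is the sum of the corresponding residues. The subsequent Fourier inversion is not shown.

```python
import numpy as np


def A(l):
    """Coupling between the states l and l+1 in the m = 0 sector."""
    return (l + 1) / np.sqrt((2 * l + 1) * (2 * l + 3))


def generator(k, xi=1.0, l_cut=30):
    """Truncated matrix of A = H + i k cos(theta), with assumed WLC eigenvalues for H."""
    ls = np.arange(l_cut + 1)
    off = 1j * k * A(ls[:-1])
    H = np.diag(ls * (ls + 1) / (2.0 * xi)).astype(complex)
    return H + np.diag(off, 1) + np.diag(off, -1)


def G00_of_L(k, L, xi=1.0, l_cut=30):
    """Inverse Laplace transform of <00|(p + A)^{-1}|00> as a sum over its simple poles."""
    lam, V = np.linalg.eig(generator(k, xi, l_cut))
    V_inv = np.linalg.inv(V)
    residues = V[0, :] * V_inv[:, 0]         # residue weight of the pole at p = -lam_j
    return np.sum(residues * np.exp(-lam * L))


# Small-k check against the WLC mean-square end-to-end distance.
xi, L, k = 1.0, 2.0, 0.01
R2 = 2.0 * xi * L - 2.0 * xi**2 * (1.0 - np.exp(-L / xi))
G_small_k = G00_of_L(k, L, xi)
```

At small wave number the result reduces to $1 - k^2\langle R^2\rangle/6$, which gives a quick consistency check before the Fourier inversion is attempted.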

For contour lengths on the order of a persistence length and shorter, inverting the
transformed expressions is impractical. The continued fraction in increasing momentum
is essentially an expansion around weak end-tangent correlation. For contour
lengths shorter than a persistence length, a larger $l$ cutoff is required, significantly
slowing the numerical inversions. In addition, the numerical integration over the
wave number becomes impractical since the numerical integrations must be extended
to a very large cutoff momentum. These convergence issues are not unique
to the continued fraction approach. For example, the transfer matrix approach is
plagued by similar shortcomings, requiring difficult numerical work at short contour
length Q. 

We therefore used a much simpler, although less elegant, solution in the form
of direct Monte Carlo integrations. Monte Carlo integration in the short-contour-length
regime (i) is numerically more efficient than direct inversion, (ii) requires
very minimal implementation, and (iii) serves as a useful check of our theoretical
results. These checks appear explicitly in few places in the paper since the agreement
between these two methods is excellent and the focus of this paper is physics rather
than numerical computations. The theoretical curve for the cyclization J factor
(Fig. 8) contains both inversion and Monte Carlo computations.
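
A minimal sketch of such a direct Monte Carlo sampling is given below in Python. It is an illustration under stated assumptions, not the code used for the figures: links are taken to be straight segments of length $\ell$ with independent deflections, the link weight is a hypothetical energy linear in the deflection angle with an arbitrary coefficient `alpha`, and the observable is $\langle R^2 \rangle$ rather than the J factor, because it can be compared with the exact sum over tangent correlations $\langle\cos\theta\rangle^{|i-j|}$.

```python
import numpy as np


def make_theta_sampler(weight, n_grid=20001):
    """Inverse-CDF sampler for the deflection angle, density ~ sin(theta) * weight(theta)."""
    theta = np.linspace(0.0, np.pi, n_grid)
    cdf = np.cumsum(np.sin(theta) * weight(theta))
    cdf /= cdf[-1]
    return lambda rng, size: np.interp(rng.random(size), cdf, theta)


def mean_cos(weight, n_grid=20001):
    """<cos(theta)> for the same density, by direct summation on the grid."""
    theta = np.linspace(0.0, np.pi, n_grid)
    w = np.sin(theta) * weight(theta)
    return np.sum(w * np.cos(theta)) / np.sum(w)


def perpendicular_frame(t):
    """Unit vectors u, v such that (u, v, t) is orthonormal for each row of t."""
    helper = np.where(np.abs(t[:, 2:3]) < 0.9, [[0.0, 0.0, 1.0]], [[1.0, 0.0, 0.0]])
    u = np.cross(t, helper)
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    return u, np.cross(t, u)


def sample_end_vectors(weight, n_links, n_chains, link_length=1.0, seed=0):
    """End-to-end vectors of n_chains independent chains of n_links links each."""
    rng = np.random.default_rng(seed)
    draw_theta = make_theta_sampler(weight)
    t = np.tile([0.0, 0.0, 1.0], (n_chains, 1))
    R = link_length * t
    for _ in range(n_links - 1):
        theta = draw_theta(rng, n_chains)[:, None]
        phi = rng.uniform(0.0, 2.0 * np.pi, n_chains)[:, None]
        u, v = perpendicular_frame(t)
        t = np.cos(theta) * t + np.sin(theta) * (np.cos(phi) * u + np.sin(phi) * v)
        t /= np.linalg.norm(t, axis=1, keepdims=True)   # guard against rounding drift
        R = R + link_length * t
    return R


def mean_square_exact(g1, n_links, link_length=1.0):
    """<R^2> = ell^2 [N + 2 sum_d (N - d) g1^d] for independent link deflections."""
    d = np.arange(1, n_links)
    return link_length**2 * (n_links + 2.0 * np.sum((n_links - d) * g1**d))


alpha = 10.0                                   # hypothetical stiffness parameter
weight = lambda theta: np.exp(-alpha * theta)
R = sample_end_vectors(weight, n_links=50, n_chains=20000)
r2_mc = np.mean(np.sum(R**2, axis=1))
r2_exact = mean_square_exact(mean_cos(weight), n_links=50)
```

A cyclization estimate would additionally histogram the sampled end vectors (and end tangents) near $R = 0$, which requires many more chains than this check of the second moment.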



